GPT-4 scored in the top 10% on a simulated bar exam
Web Credibility Rating
Good quality. Reputable source with community review or editorial standards, but less rigorous than peer-reviewed venues.
Rating inherited from publication venue: SSRN
Frequently cited as evidence of frontier LLM capability jumps; relevant to discussions of AI capability evaluation, deployment risks, and how quickly AI systems are approaching or exceeding human professional performance thresholds.
Metadata
Summary
This paper (Katz, Bommarito, Gao, and Arredondo's "GPT-4 Passes the Bar Exam") documents GPT-4's performance on the simulated Uniform Bar Exam, where it scored in the top 10% of test takers, alongside strong results on other standardized professional and academic benchmarks. It demonstrates a significant capability leap over prior language models on legally and academically rigorous tasks.
Key Points
- GPT-4 scored in approximately the top 10% on a simulated Uniform Bar Exam, whereas GPT-3.5 scored near the bottom 10%.
- The model was evaluated across a wide range of standardized exams, including the LSAT, GRE, SAT, and various AP exams, showing broad professional-level competence.
- Results highlight rapid capability scaling, with GPT-4 substantially outperforming its predecessor on tasks requiring nuanced reasoning and domain knowledge.
- The findings raise important questions about AI deployment in high-stakes professional domains such as law, medicine, and finance.
- Such benchmark performance is relevant to both capability assessment and discussions of AI safety, misuse risks, and governance.
Cited by 1 page
| Page | Type | Quality |
|---|---|---|
| Mainstream Era | Historical | 42.0 |
Cached Content Preview
# GPT-4 Passes the Bar Exam
382 Philosophical Transactions of the Royal Society A (2024)
35 Pages · Posted: 15 Mar 2023 · Last revised: 3 Apr 2024
## [Daniel Martin Katz](https://papers.ssrn.com/sol3/cf_dev/AbsByAuth.cfm?per_id=627779 "View other papers by this author")
Illinois Tech - Chicago Kent College of Law; Bucerius Center for Legal Technology & Data Science; Stanford CodeX - The Center for Legal Informatics; 273 Ventures; ALEA Institute
## [Michael James Bommarito](https://papers.ssrn.com/sol3/cf_dev/AbsByAuth.cfm?per_id=817068 "View other papers by this author")
273 Ventures; ALEA Institute; Stanford Center for Legal Informatics; Michigan State College of Law; Bommarito Consulting, LLC
## [Shang Gao](https://papers.ssrn.com/sol3/cf_dev/AbsByAuth.cfm?per_id=5783831 "View other papers by this author")
Casetext
## [Pablo Arredondo](https://papers.ssrn.com/sol3/cf_dev/AbsByAuth.cfm?per_id=5783833 "View other papers by this author")
Casetext; Stanford CodeX
Date Written: March 15, 2023
### Abstract
In this paper, we experimentally evaluate the zero-shot performance of a preliminary version of GPT-4 against prior generations of GPT on the entire Uniform Bar Examination (UBE), including not only the multiple-choice Multistate Bar Examination (MBE), but also the open-ended Multistate Essay Exam (MEE) and Multistate Performance Test (MPT) components. On the MBE, GPT-4 significantly outperforms both human test-takers and prior models, demonstrating a 26% increase over ChatGPT and beating humans in five of seven subject areas. On the MEE and MPT, which have not previously been evaluated by scholars, GPT-4 scores an average
... (truncated, 12 KB total)