GPT-4 scored in the top 10% on a simulated bar exam
Web Credibility Rating
Good quality. Reputable source with community review or editorial standards, but less rigorous than peer-reviewed venues.
Rating inherited from publication venue: SSRN
Frequently cited as evidence of frontier LLM capability jumps; relevant to discussions of AI capability evaluation, deployment risks, and how quickly AI systems are approaching or exceeding human professional performance thresholds.
Metadata
Summary
This paper (Katz, Bommarito, Gao, and Arredondo's "GPT-4 Passes the Bar Exam") documents GPT-4's performance on the simulated Uniform Bar Exam, where it scored in the top 10% of test takers, alongside strong results on other standardized professional and academic benchmarks. It demonstrates a significant capability leap over prior language models on legally and academically rigorous tasks.
Key Points
- GPT-4 scored in approximately the top 10% on a simulated Uniform Bar Exam, whereas GPT-3.5 scored near the bottom 10%.
- The model was evaluated across a wide range of standardized exams, including the LSAT, GRE, SAT, and various AP exams, showing broad professional-level competence.
- Results highlight rapid capability scaling, with GPT-4 substantially outperforming its predecessor on tasks requiring nuanced reasoning and domain knowledge.
- The findings raise important questions about AI deployment in high-stakes professional domains such as law, medicine, and finance.
- Such benchmark performance is relevant to both capability assessment and discussions of AI safety, misuse risks, and governance.
Cited by 1 page
| Page | Type | Quality |
|---|---|---|
| Mainstream Era | Historical | 42.0 |
Cached Content Preview
# GPT-4 Passes the Bar Exam
382 Philosophical Transactions of the Royal Society A (2024)
35 Pages · Posted: 15 Mar 2023 · Last revised: 3 Apr 2024
## [Daniel Martin Katz](https://papers.ssrn.com/sol3/cf_dev/AbsByAuth.cfm?per_id=627779 "View other papers by this author")
Illinois Tech - Chicago Kent College of Law; Bucerius Center for Legal Technology & Data Science; Stanford CodeX - The Center for Legal Informatics; 273 Ventures; ALEA Institute
## [Michael James Bommarito](https://papers.ssrn.com/sol3/cf_dev/AbsByAuth.cfm?per_id=817068 "View other papers by this author")
273 Ventures; ALEA Institute; Stanford Center for Legal Informatics; Michigan State College of Law; Bommarito Consulting, LLC
## [Shang Gao](https://papers.ssrn.com/sol3/cf_dev/AbsByAuth.cfm?per_id=5783831 "View other papers by this author")
Casetext
## [Pablo Arredondo](https://papers.ssrn.com/sol3/cf_dev/AbsByAuth.cfm?per_id=5783833 "View other papers by this author")
Casetext; Stanford CodeX
Date Written: March 15, 2023
### Abstract
In this paper, we experimentally evaluate the zero-shot performance of a preliminary version of GPT-4 against prior generations of GPT on the entire Uniform Bar Examination (UBE), including not only the multiple-choice Multistate Bar Examination (MBE), but also the open-ended Multistate Essay Exam (MEE) and Multistate Performance Test (MPT) components. On the MBE, GPT-4 significantly outperforms both human test-takers and prior models, demonstrating a 26% increase over ChatGPT and beating humans in five of seven subject areas. On the MEE and MPT, which have not previously been evaluated by scholars, GPT-4 scores an average
... (truncated, 12 KB total)