GSM8K
Category: Math
Grade School Math 8K: a dataset of 8,500 linguistically diverse grade-school math word problems that require multi-step reasoning with basic arithmetic operations.
Models Tested: 2
Best Score: 96.8%
Median Score: 93.05%
Scoring: accuracy
Introduced: 2021-10
Maintainer: OpenAI
Leaderboard (2 models)
| # | Model | Developer | Score |
|---|---|---|---|
| 🥇 | Llama 3.1 | Meta AI (FAIR) | 96.8% |
| 🥈 | DeepSeek V3 | DeepSeek | 89.3% |
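
The summary figures (models tested, best score, median score) appear to follow directly from the two leaderboard rows. The sketch below shows one way they could be reproduced. The function name `summarize` and the dictionary layout are hypothetical, chosen for illustration, and are not taken from the site that produced this page.

```python
from statistics import median

# Scores copied from the leaderboard table (accuracy, in percent).
leaderboard = {
    "Llama 3.1": 96.8,
    "DeepSeek V3": 89.3,
}


def summarize(scores):
    """Return the count, best score, and median of a {model: score} mapping.

    Hypothetical helper for illustration only.
    """
    values = list(scores.values())
    return {
        "models_tested": len(values),
        "best": max(values),
        # With an even number of entries, the median is the mean of the
        # two middle values; rounding hides floating-point noise.
        "median": round(median(values), 2),
    }


summary = summarize(leaderboard)
print(summary)
```

With only two entries, the median is simply the midpoint of the two scores, so it will shift noticeably as more models are added.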