Longterm Wiki

MGSM

Math

Multilingual Grade School Math: a benchmark of 250 grade-school math problems drawn from GSM8K and manually translated into 10 typologically diverse languages. It tests multilingual mathematical reasoning.

Models Tested: 7
Best Score: 92.4%
Median Score: 91.6%
Scoring: accuracy
Introduced: 2022-10
Maintainer: Google Research

Leaderboard (7 models)

#  Model              Developer        Score
1  Claude 3.7 Sonnet  Anthropic        92.4%
2  Gemini 2.5 Pro     Google DeepMind  92.2%
3  Claude 3.5 Sonnet  Anthropic        91.6%
4  Llama 3.1          Meta AI (FAIR)   91.6%
5  GPT-4o             OpenAI           90.5%
6  Claude 3.5 Haiku   Anthropic        85.6%
7  Gemini 1.5 Flash   Google DeepMind  82.6%
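
The best and median scores reported for this benchmark can be checked against the leaderboard entries. The following is a minimal sketch, assuming the scores are copied by hand from the leaderboard; the `scores` dictionary and the variable names are illustrative and not part of any wiki API.

```python
from statistics import median

# Accuracy scores (percent) copied by hand from the leaderboard entries.
# The dictionary layout is an assumption made for this illustration.
scores = {
    "Claude 3.7 Sonnet": 92.4,
    "Gemini 2.5 Pro": 92.2,
    "Claude 3.5 Sonnet": 91.6,
    "Llama 3.1": 91.6,
    "GPT-4o": 90.5,
    "Claude 3.5 Haiku": 85.6,
    "Gemini 1.5 Flash": 82.6,
}

# Best score: the model with the highest accuracy.
best_model = max(scores, key=scores.get)
best_score = scores[best_model]

# Median score: with 7 entries this is the 4th value in sorted order.
median_score = median(scores.values())

print(f"Models tested: {len(scores)}")
print(f"Best score: {best_score}% ({best_model})")
print(f"Median score: {median_score}%")
```

With an odd number of entries, the median is the middle value, so the tie between the third and fourth entries determines it directly.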