Longterm Wiki

MathVista

Multimodal
A benchmark for mathematical reasoning in visual contexts. It combines visual understanding with mathematical problem-solving across geometry, charts, scientific figures, and more.
Models Tested: 4
Best Score: 73.9%
Median Score: 70.4%
Scoring: accuracy
Introduced: 2023-10
Maintainer: UCLA / Microsoft Research

Leaderboard (4 models)

#    Model               Developer   Score
🥇   o1                  OpenAI      73.9%
🥈   GPT-4.1 mini        OpenAI      73.1%
🥉   Claude 3.5 Sonnet   Anthropic   67.7%
4    GPT-4o              OpenAI      63.8%
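
The summary figures near the top of the page (models tested, best score, median score) can be reproduced from the four leaderboard scores. The sketch below is only an illustration of that arithmetic, not the site's actual code; the `scores` dictionary and the variable names are assumptions introduced here, with values copied from the leaderboard. With an even number of entries, the median is the mean of the two middle scores.

```python
from statistics import median

# Percent accuracy per model, copied from the leaderboard above.
# The dictionary layout is an assumption made for this example.
scores = {
    "o1": 73.9,
    "GPT-4.1 mini": 73.1,
    "Claude 3.5 Sonnet": 67.7,
    "GPT-4o": 63.8,
}

# Best score: the highest accuracy and the model that achieved it.
best_model = max(scores, key=scores.get)
best_score = scores[best_model]

# Median score: with four entries, the mean of the two middle values.
median_score = median(scores.values())

print(f"Models tested: {len(scores)}")
print(f"Best score: {best_score:.1f}% ({best_model})")
print(f"Median score: {median_score:.1f}%")
```

Here the two middle scores are 73.1 and 67.7, so the median is (73.1 + 67.7) / 2 = 70.4, which matches the Median Score shown on the page.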