MathVista

Multimodal

A benchmark for mathematical reasoning in visual contexts. It combines visual understanding with mathematical problem-solving across geometry, charts, scientific figures, and other visual inputs.

Models Tested

4

Best Score

73.9%

Median Score

70.4%

Scoring: accuracy

Introduced: 2023-10

Maintainer: UCLA / Microsoft Research

Leaderboard (4 models)

#	Model	Developer	Score
🥇	o1	OpenAI	73.9%
🥈	GPT-4.1 mini	OpenAI	73.1%
🥉	Claude 3.5 Sonnet	Anthropic	67.7%
4	GPT-4o	OpenAI	63.8%
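
The summary figures can be checked against the leaderboard rows. The following is a minimal sketch, assuming the best score is the maximum of the listed scores and the median is taken over the same four entries; the variable names are illustrative and not part of any published tooling. Under those assumptions it reproduces the 73.9% best score and the 70.4% median shown above.

```python
from statistics import median

# Scores (percent accuracy) copied from the leaderboard rows above.
leaderboard = {
    "o1": 73.9,
    "GPT-4.1 mini": 73.1,
    "Claude 3.5 Sonnet": 67.7,
    "GPT-4o": 63.8,
}

scores = sorted(leaderboard.values(), reverse=True)

# Best score: the highest accuracy among the listed models.
best_score = scores[0]

# With an even number of entries, the median is the mean of the
# two middle scores; round to one decimal place to match the page.
median_score = round(median(scores), 1)

print(f"Best: {best_score}%  Median: {median_score}%")
```

Because there are four entries, the median falls between the second and third rows (73.1% and 67.7%), so it does not correspond to any single model's score.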