Longterm Wiki

MMMU

Multimodal
Massive Multi-discipline Multimodal Understanding — 11,500 questions from college exams across 30 subjects and 6 disciplines, requiring college-level reasoning over images.
Models Tested: 6
Best Score: 81.7%
Median Score: 71.4%
Scoring: accuracy
Introduced: 2023-11

Leaderboard (6 models)

#    Model               Developer          Score
🥇   Gemini 2.5 Pro      Google DeepMind    81.7%
🥈   Claude Opus 4.6     Anthropic          76.5%
🥉   Llama 4 Maverick    Meta AI (FAIR)     73.4%
4    Llama 4 Scout       Meta AI (FAIR)     69.4%
5    Claude 3.7 Sonnet   Anthropic          69.1%
6    Claude 3 Sonnet     Anthropic          53.1%
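
The summary figures (best and median score) can be recomputed from the leaderboard rows. The following is a minimal sketch, assuming the scores are accuracy percentages as listed and that the median of an even number of entries is the mean of the two middle values. The `scores` dictionary and variable names are illustrative and not part of the wiki itself.

```python
from statistics import median

# Leaderboard scores (accuracy, in percent) copied from the table above.
scores = {
    "Gemini 2.5 Pro": 81.7,
    "Claude Opus 4.6": 76.5,
    "Llama 4 Maverick": 73.4,
    "Llama 4 Scout": 69.4,
    "Claude 3.7 Sonnet": 69.1,
    "Claude 3 Sonnet": 53.1,
}

# Best score: the highest accuracy on the leaderboard.
best_model = max(scores, key=scores.get)
best_score = scores[best_model]

# Median score: with six entries, the mean of the 3rd and 4th values,
# rounded to one decimal place to match the page's display precision.
median_score = round(median(scores.values()), 1)

print(f"Models tested: {len(scores)}")
print(f"Best score: {best_score}% ({best_model})")
print(f"Median score: {median_score}%")
```

With six models, the two middle values are 73.4 and 69.4, so the median falls between the third- and fourth-ranked entries rather than on any single model's score.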