Longterm Wiki

Measuring Massive Multitask Language Understanding (MMLU)

publication | Verified

Metadata

Source Table: publications
Source ID: QNbZISNZtJ
Description: Dan Hendrycks, Collin Burns, Steven Basart, Andy Zou, Mantas Mazeika, Dawn Song, Jacob Steinhardt, 2020-09
Source URL: arxiv.org/abs/2009.03300
Parent: Center for AI Safety (CAIS)
Children:
Created: Mar 23, 2026, 2:46 PM
Updated: Mar 23, 2026, 2:46 PM
Synced: Mar 23, 2026, 2:46 PM

Record Data

id: QNbZISNZtJ
entityId: Center for AI Safety (CAIS) (organization)
entityDisplayName:
resourceId:
title: Measuring Massive Multitask Language Understanding (MMLU)
authors: Dan Hendrycks, Collin Burns, Steven Basart, Andy Zou, Mantas Mazeika, Dawn Song, Jacob Steinhardt
url: arxiv.org/abs/2009.03300
venue:
publishedDate: 2020-09
publicationType: paper
citationCount:
isFlagship: Yes
abstract:
source: arxiv.org/abs/2009.03300
notes: One of the most widely used AI capability benchmarks. Published at ICLR 2021.

Source Check Verdicts

confirmed (95% confidence)

Last checked: 4/29/2026

1 → confirmed

Debug info

Thing ID: QNbZISNZtJ

Source Table: publications

Source ID: QNbZISNZtJ

Parent Thing ID: sid_y4bieqSeag