Publication: Measuring Massive Multitask Language Understanding (MMLU) by Dan Hendrycks, Collin Burns, Steven Basart, Andy Zou, Mantas Mazeika, Dawn Song, Jacob Steinhardt (2020-09)
1 evidence check
Last checked: 4/3/2026
All key fields in the record are confirmed by the source text. The title, all seven authors in the correct order, and the publication date (2020-09, corresponding to arxiv submission September 2020) are explicitly stated. The arxiv URL format matches the standard pattern for arxiv papers. The publication type as 'paper' is appropriate for an arxiv preprint. No contradictions or discrepancies were found.
Evidence — 1 source, 1 check
Note: All key fields in the record are confirmed by the source text. The title, all seven authors in the correct order, and the publication date (2020-09, corresponding to arxiv submission September 2020) are explicitly stated. The arxiv URL format matches the standard pattern for arxiv papers. The publication type as 'paper' is appropriate for an arxiv preprint. No contradictions or discrepancies were found.
Debug info
Record type: publication
Record ID: QNbZISNZtJ