Center for AI Safety (CAIS) — publication: Measuring Massive Multitask Language Understanding (MMLU) — widely-used benchmark for evaluating LLM capabilities across 57 academic subjects
Verdict: partial · 85%
1 check · 5/18/2026 · 1 → partial
Our claim
entire record
- Subject
- Center for AI Safety (CAIS)
- Value
- Measuring Massive Multitask Language Understanding (MMLU) — widely-used benchmark for evaluating LLM capabilities across 57 academic subjects
- As Of
- January 2021
- Notes
- Created by Hendrycks et al.; became one of the most-cited AI benchmarks
Source evidence
1 src · 1 check · partial · 85% · primary · Haiku 4.5 · 5/18/2026
Note: The claim attributes MMLU to 'Center for AI Safety (CAIS)' as the publishing organization. The source paper lists author affiliations as UC Berkeley, Columbia University, UChicago, and UIUC, none of which is CAIS. While the paper confirms the benchmark name, the 57 subjects, and Hendrycks et al. as creators, it does not confirm CAIS as the institutional source. The source is about the MMLU benchmark itself, not about CAIS as an organization. This is a subject-identity mismatch: the claim attributes the work to CAIS, but the source shows it was authored by researchers at different institutions. The benchmark details are confirmed, but the organizational attribution is not supported by the source.
Case № f_mGXpFffUh7 · Filed 5/18/2026 · Confidence 85%