Fact · f_mGXpFffUh7

Center for AI Safety (CAIS) — publication: Measuring Massive Multitask Language Understanding (MMLU) — widely-used benchmark for evaluating LLM capabilities across 57 academic subjects

Verdict: partial · 85%

1 check · 5/18/2026

1 → partial

Our claim

entire record

Subject: Center for AI Safety (CAIS)
Value: Measuring Massive Multitask Language Understanding (MMLU) — widely-used benchmark for evaluating LLM capabilities across 57 academic subjects
As Of: January 2021
Source: https://arxiv.org/abs/2009.03300
Notes: Created by Hendrycks et al.; became one of the most-cited AI benchmarks

Source evidence

1 src · 1 check

arxiv.org/abs/2009.03300 resource

partial · 85% · primary · Haiku 4.5 · 5/18/2026

Note: The claim attributes MMLU to the Center for AI Safety (CAIS) as the publishing organization. The source paper lists author affiliations as UC Berkeley, Columbia University, UChicago, and UIUC, none of which is CAIS. The paper confirms the benchmark name, the 57 subjects, and Hendrycks et al. as creators, but it is about the MMLU benchmark itself, not about CAIS as an organization. This is a subject-identity mismatch: the benchmark details are confirmed, but the organizational attribution to CAIS is not supported by the source.

Case № f_mGXpFffUh7 · Filed 5/18/2026 · Confidence 85%