Longterm Wiki

AI Evaluations

Evaluation (active)

Systematic testing and measurement of AI system capabilities, alignment, and safety properties.

- Organizations: 4
- Grants: 122
- Total Funding: $111M
- Risks Addressed: 2
Cluster: Evaluation

Tags

evaluations · safety-research · testing

Organizations (4)

| Organization | Role |
| --- | --- |
| Anthropic | active |
| Alignment Research Center | active |
| Google DeepMind | active |
| OpenAI | active |

Grants (Top 50 of 122)

| Name | Recipient | Amount | Funder | Date |
| --- | --- | --- | --- | --- |
| OpenMined Foundation — Secure Enclaves for LLM Evaluation | OpenMined | $11M | Coefficient Giving | 2025-06 |
| UC Davis — Malaria Gene Drive Feasibility Analysis (Greg Lanzaro) (2021) | University of California, Davis | $10M | Coefficient Giving | 2021-12 |
| RAND — AI Evaluation and Testing | RAND | $10M | Coefficient Giving | 2025-09 |
| University of Oxford — Malaria Vaccine Manufacturing | University of Oxford | $4.7M | Coefficient Giving | 2024-12 |
| UC Berkeley — Cyberoffense Benchmark | University of California, Berkeley | $3.4M | Coefficient Giving | 2025-06 |
| General Support of Alignment Research Center (Evals Team) | ARC Evaluations | $3.2M | Survival and Flourishing Fund | 2023-01 |
| Center for Global Development — General Support (2019) | Center for Global Development | $3M | Coefficient Giving | 2019-06 |
| FutureHouse — Benchmarks for Biology Research and Development | FutureHouse | $2.9M | Coefficient Giving | 2024-03 |
| Stanford University — LLM Cybersecurity Benchmark | Stanford University | $2.9M | Coefficient Giving | 2024-07 |
| Harvard University — Antimalarial Bednet Development and Evaluation | Harvard University | $2.9M | Coefficient Giving | 2024-05 |
| African Union Development Agency — General Support (2020) | African Union Development Agency | $2.5M | Coefficient Giving | 2020-04 |
| New Partnership for Africa’s Development Planning and Coordinating Agency — General Support | New Partnership for Africa’s Development | $2.4M | Coefficient Giving | 2017-04 |
| IDinsight — Embedded GiveWell Team (2017) | IDinsight | $2.3M | Coefficient Giving | 2017-05 |
| Apollo Research — General Support | Apollo Research | $2.2M | Coefficient Giving | 2024-05 |
| IDinsight — Endline Evaluation of New Incentives RCT | IDinsight | $2.1M | Coefficient Giving | 2019-08 |
| New York University — LLM Cybersecurity Benchmark | New York University | $2.1M | Coefficient Giving | 2024-07 |
| Malaria Consortium — Monitoring and Evaluation of LLIN Distribution Campaign | Malaria Consortium | $2.1M | Coefficient Giving | 2021-07 |
| Center for Study of Science, Technology and Policy — Centre for Air Pollution Studies Initiative | CSTEP | $2M | Coefficient Giving | 2023-08 |
| IDinsight — General Support | IDinsight | $2M | Coefficient Giving | 2016-06 |
| University of Maryland — LLM Cybersecurity Benchmark | University of Maryland | $1.7M | Coefficient Giving | 2024-09 |
| Center for Open Science — LLM Research Benchmark | Center for Open Science | $1.7M | Coefficient Giving | 2024-07 |
| SeedAI — General Support | SeedAI | $1.6M | Coefficient Giving | 2025-03 |
| UC Davis — Malaria Gene Drive Feasibility Analysis (Greg Lanzaro) | University of California, Davis | $1.5M | Coefficient Giving | 2020-02 |
| UC Berkeley — Research on Rapid COVID-19 Serology Testing (Lisa Barcellos and Eva Harris) | University of California, Berkeley | $1.3M | Coefficient Giving | 2020-04 |
| Texas Organizing Project — Criminal Justice Reform (2017) | Texas Organizing Project | $1.2M | Coefficient Giving | 2017-03 |
| Owain Evans Research Group — AI Evaluations Research | Effective Ventures Foundation USA | $1.2M | Coefficient Giving | 2023-05 |
| Malaria Consortium — Monitoring and Evaluation of Net Distribution in Anambra, Nigeria | Malaria Consortium | $1.1M | Coefficient Giving | 2021-12 |
| Algorithmic Research Group — Language Model Capabilities Benchmarking (2024) | Algorithmic Research Group | $1.1M | Coefficient Giving | 2024-08 |
| Princeton University — Software Engineering LLM Benchmark | Princeton University | $1M | Coefficient Giving | 2024-05 |
| gui2de — Zusha! Road Safety Campaign (February 2017) | Georgetown University | $900K | Coefficient Giving | 2017-02 |
| Stanford University — LLM-Generated Research Ideation Benchmark | Stanford University | $880K | Coefficient Giving | 2024-05 |
| Princeton University — AI R&D Benchmark | Princeton University | $863K | Coefficient Giving | 2024-09 |
| Friedrich Schiller University Jena — Analytical Chemistry Benchmark | Friedrich Schiller University Jena | $829K | Coefficient Giving | 2024-10 |
| Evidence Action — Impact Evaluation of Iron and Folic Acid Supplementation (“Phase 2”) | Evidence Action | $800K | Coefficient Giving | 2019-03 |
| University of Illinois Foundation — LLM Hacking Benchmarks | University of Illinois Urbana-Champaign | $800K | Coefficient Giving | 2024-01 |
| Trustees of Boston University — LLM Research Benchmark | Boston University | $756K | Coefficient Giving | 2024-07 |
| University of California, Berkeley — Software Engineering Benchmark | University of California, Berkeley | $740K | Coefficient Giving | 2024-08 |
| Daniel Kang — Research on AI Benchmarks | Daniel Kang | $680K | Coefficient Giving | 2024-12 |
| Abdul Latif Jameel Poverty Action Lab — Innovation and Science Research | Abdul Latif Jameel Poverty Action Lab | $649K | Coefficient Giving | 2022-08 |
| University of Oxford — LLM Research Replication | University of Oxford | $622K | Coefficient Giving | 2024-09 |
| University of Illinois Urbana-Champaign — Zero-knowledge Proofs for Secure AI Audits | University of Illinois Urbana-Champaign | $615K | Coefficient Giving | 2025-02 |
| FutureSearch — Benchmark for Language Model Forecasting | FutureSearch | $607K | Coefficient Giving | 2024-03 |
| Yale University — LLM Persuasiveness Evaluation | Yale University | $596K | Coefficient Giving | 2024-06 |
| Grant to Model Evaluation & Threat Research (METR) | METR | $548K | Survival and Flourishing Fund | 2025 |
| Carnegie Mellon University — Benchmark for Web-Based Tasks | Carnegie Mellon University | $547K | Coefficient Giving | 2024-03 |
| WestExec — Report on Assurance in Machine Learning Systems | WestExec | $540K | Coefficient Giving | 2020-02 |
| Precision Development — Trial Scoping | Precision Development | $540K | Coefficient Giving | 2022-04 |
| Metaculus — Forecasting Tournaments | Metaculus | $532K | Coefficient Giving | 2024-05 |
| iGEM — Synthetic Biology Safety and Security (2016) | International Genetically Engineered Machine Foundation | $520K | Coefficient Giving | 2016-05 |
| Cambridge in America — Data Science Benchmark | Cambridge in America | $518K | Coefficient Giving | 2024-07 |

Funding by Funder

Sub-Areas (12)

| Name | Description | Status | Orgs | Papers |
| --- | --- | --- | --- | --- |
| Alignment Evaluations | Evaluations specifically designed to measure alignment properties: honesty, helpfulness, harmlessness, and value adherence. | active | 4 | 0 |
| Alignment Faking | Research on whether and how AI systems might pretend to be aligned during evaluation while pursuing different goals at deployment. | emerging | 4 | 1 |
| Backdoor Detection | Detecting adversarially implanted vulnerabilities in model weights. | active | 4 | 0 |
| Capability Elicitation | Methods for discovering hidden or latent capabilities in AI systems. | active | 4 | 0 |
| Control Evaluations | Stress-testing systems designed to constrain AI behavior; monitoring for collusion. | emerging | 4 | 0 |
| Dangerous Capability Evaluations | Testing AI systems specifically for dangerous capabilities like CBRN knowledge, cyber offense, autonomous replication, and persuasion. | active | 4 | 0 |
| Evaluation Awareness | Studying how AI systems might game evaluations by detecting when they are being tested. | active | 4 | 0 |
| Jailbreak Research | Research on prompt injection, jailbreaking attacks, and defenses for language model safety filters. | active | 4 | 0 |
| Red Teaming | Adversarial testing of AI systems to discover failure modes, safety issues, and vulnerabilities, both manual and automated. | active | 7 | 0 |
| Reward Hacking of Human Oversight | Empirically investigating how AI systems deceive or manipulate human evaluators. | emerging | 4 | 0 |
| Scheming Detection | Research on detecting when AI systems are engaged in deceptive alignment or strategic manipulation of their training process. | emerging | 4 | 0 |
| Sleeper Agent Detection | Research on detecting and mitigating backdoors, trojans, and time-delayed deceptive behavior in AI systems. | emerging | 4 | 1 |