Apollo Research - AI Safety Evaluation Organization

web

Credibility Rating

4/5

High(4)

High quality. Established institution or organization with editorial oversight and accountability.

Rating inherited from publication venue: Apollo Research

Apollo Research is a key third-party evaluator in the AI safety ecosystem, providing independent assessments of frontier models for dangerous capabilities and advising policymakers; their work on scheming evaluations is directly relevant to deceptive alignment concerns.

Metadata

Importance: 62/100homepage

Summary

Apollo Research is an AI safety organization focused on evaluating frontier AI systems for dangerous capabilities, particularly 'scheming' behaviors where advanced AI covertly pursues misaligned objectives. They conduct LLM agent evaluations for strategic deception, evaluation awareness, and scheming, while also advising governments on AI governance frameworks.

Key Points

•Specializes in evaluating frontier AI for 'scheming' - covert pursuit of misaligned objectives by advanced AI systems
•Conducts LLM agent evaluations focused on strategic deception, evaluation awareness, and scheming detection
•Partners with major AI labs including OpenAI, Google DeepMind, Microsoft, and Amazon
•Supports governments and international organizations in developing technical AI governance regimes and evaluation standards
•Conducts fundamental research into emergence of scheming behaviors and potential mitigations

Cited by 15 pages

Page	Type	Quality
Large Language Models	Concept	62.0
AI Risk Cascade Pathways Model	Analysis	67.0
AI Risk Warning Signs Model	Analysis	70.0
Apollo Research	Organization	58.0
Survival and Flourishing Fund (SFF)	Organization	59.0
Capability Elicitation	Approach	91.0
AI Governance Coordination Technologies	Approach	91.0
Dangerous Capability Evaluations	Approach	64.0
Evals-Based Deployment Gates	Approach	66.0
AI Evaluations	Research Area	72.0
AI Evaluation	Approach	72.0
Third-Party Model Auditing	Approach	64.0
Sleeper Agent Detection	Approach	66.0
Technical AI Safety Research	Crux	66.0
AI Development Racing Dynamics	Risk	72.0

Cached Content Preview

HTTP 200Fetched Apr 7, 20263 KB

Apollo Research

Dedicated to improving our understanding of AI to mitigate its risks.

Our mission

AI systems will soon be integrated into large parts of the economy and our personal lives.

While this transformation may unlock substantial personal and societal benefits, there are also vast risks. We think some of the greatest risks stem from &#8220;scheming&#8221; AIs, i.e. advanced AI systems that covertly pursue misaligned objectives. Our goal is to understand and evaluate for the emergence of scheming well enough to prevent the possible harms that scheming AIs might cause.

About us

Apollo Research is focused on reducing risks from dangerous capabilities in advanced AI systems, especially scheming behaviors.

We design AI model evaluations and conduct technical research to better understand state-of-the-art AI models. Our governance team provides global policymakers with expert technical guidance.

What we do

Our research

Model Evaluations

We develop and run evaluations of frontier AI systems. Our expertise is in LLM agent evaluations for strategic deception, evaluation awareness and scheming. We also conduct fundamental research into the emergence of scheming and into potential mitigations.

Governance & Policy

We support governments and international organisations by developing technical AI governance regimes. Expertise in building a robust third-party evaluation ecosystem, effectively regulating frontier AI systems, and establishing standards and best practices.

Consultancy

Additionally, we provide consultancy services for building responsible AI development frameworks, designing research programs, ecosystem mapping and literature reviews, and more.

Our partners

Frontier labs, multinational companies, governments, and foundations partner with Apollo Research.

“One thing you might imagine is testing for deception for example, as a capability. You really don’t want that in the system because then you can’t rely on anything else it’s reporting. So that would be my number one emerging capability I think that would be good to test for.”
Demis Hassabis CEO Google DeepMind

Contact

Get in touch

For collaborations and other inquiries, please get in touch

Currently, we are looking for collaborators in the broader AI governance, policy, and strategy sphere, and for partnerships with leading AI developers for model evaluations.

Get in touch

Resource ID: 329d8c2e2532be3d | Stable ID: sid_DCU0fYxYRO