Apollo Research is an AI safety research organization founded in 2023 with a specific focus on one of the most concerning potential failure modes in advanced AI systems: deceptive alignment and scheming.
Approaches
- AI Safety Cases (Approach): Safety cases are structured arguments adapted from nuclear/aviation to justify AI system safety, with UK AISI publishing templates in 2024 and 3 of 4 frontier labs committing to implementation. Apo... (Quality: 91/100)
- Evaluation Awareness (Approach): Models increasingly detect evaluation contexts and behave differently: Claude Sonnet 4.5 at 58% detection rate (vs. 22% for Opus 4.1), with Opus 4.6 evaluation awareness so high that Apollo Research... (Quality: 68/100)
- Scalable Eval Approaches (Approach): Survey of practical approaches for scaling AI evaluation. LLM-as-judge has reached ~40% production adoption with 80%+ human agreement, but Dorner et al. (ICLR 2025 oral) proved a theoretical ceilin... (Quality: 65/100)
- Third-Party Model Auditing (Approach): Third-party auditing organizations (METR, Apollo, UK/US AISIs) now evaluate all major frontier models pre-deployment, discovering that AI task horizons double every 7 months (GPT-5: 2h17m), 5/6 mod... (Quality: 64/100). The doubling-time arithmetic behind the task-horizon claim is sketched after this list.
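A minimal sketch of the doubling-time arithmetic above, assuming a clean exponential trend with a 7-month doubling period and GPT-5's reported 2h17m horizon as the baseline; the two figures come from the card, while the extrapolation model itself is an illustrative assumption.

```python
# Illustrative projection of AI task horizons under a fixed doubling time.
# Baseline (2h17m) and the 7-month doubling period are from the card above;
# extrapolating them forward is an assumption, not a claim from the source.

def projected_horizon_minutes(baseline_minutes: float,
                              months_ahead: float,
                              doubling_months: float = 7.0) -> float:
    """Horizon after t months: h(t) = h0 * 2**(t / doubling_time)."""
    return baseline_minutes * 2 ** (months_ahead / doubling_months)

baseline = 2 * 60 + 17  # GPT-5: 2h17m = 137 minutes
for months in (7, 14, 28):
    h = projected_horizon_minutes(baseline, months)
    print(f"+{months:2d} months: {h / 60:6.1f} hours")
# +7 months doubles to ~4.6h; +28 months (4 doublings) gives ~36.5h.
```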
Analysis
- AI Safety Intervention Effectiveness Matrix (Analysis): Quantitative analysis mapping 15+ AI safety interventions to specific risks reveals critical misallocation: 40% of 2024 funding ($400M+) flows to RLHF methods showing only 10-20% effectiveness agai... (Quality: 73/100)
- AI Risk Interaction Network Model (Analysis): Systematic analysis identifying racing dynamics as a hub risk enabling 8 downstream risks with 2-5x amplification, and showing compound risk scenarios create 3-8x higher catastrophic probabilities ... (Quality: 64/100). A toy version of this compounding arithmetic follows.
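A toy numerical illustration of the hub-risk mechanism described above: a single upstream risk (racing dynamics) amplifying the probabilities of downstream risks. The individual probabilities below are invented for illustration; only the 2-5x amplification range comes from the card.

```python
# Toy model of compound risk amplification. All base rates are made-up
# illustrative numbers; only the 2-5x amplification range is from the card.

def p_any(probs):
    """P(at least one event occurs), assuming independence."""
    p_none = 1.0
    for p in probs:
        p_none *= 1.0 - p
    return 1.0 - p_none

downstream = [0.05, 0.04, 0.03]                      # hypothetical base rates
amplified = [min(1.0, 3 * p) for p in downstream]    # 3x hub amplification

print(f"P(any downstream), no racing:   {p_any(downstream):.3f}")  # ~0.115
print(f"P(any downstream), with racing: {p_any(amplified):.3f}")   # ~0.319
```

Even a modest 3x amplification of each downstream risk roughly triples the compound probability, which is the qualitative pattern the card's 3-8x figure describes.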
Policy
- Voluntary AI Safety Commitments (Policy): Comprehensive empirical analysis of voluntary AI safety commitments showing 53% mean compliance rate across 30 indicators (ranging from 13% for Apple to 83% for OpenAI), with strongest adoption in ... (Quality: 91/100)
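The compliance figures above are means over binary indicators. A minimal sketch of that aggregation, with a fabricated indicator matrix standing in for the study's real 30-indicator data:

```python
# Sketch of per-lab compliance rates and the cross-lab mean, computed from
# a binary indicator matrix. The matrix is fabricated for shape only; the
# real analysis scores 30 indicators per company.

indicators = {                # 1 = commitment met, 0 = not met (toy data)
    "LabA": [1, 1, 0, 1, 0, 1],
    "LabB": [0, 1, 0, 0, 0, 1],
    "LabC": [1, 1, 1, 1, 0, 1],
}

rates = {lab: sum(v) / len(v) for lab, v in indicators.items()}
for lab, r in rates.items():
    print(f"{lab}: {r:.0%}")
print(f"mean compliance: {sum(rates.values()) / len(rates):.0%}")
```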
Risks
- AI Capability Sandbagging (Risk): Systematically documents sandbagging (strategic underperformance during evaluations) across frontier models, finding 70-85% detection accuracy with white-box probes, 18-24% accuracy drops on autono... (Quality: 67/100)
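The 70-85% figure above refers to white-box probes on model internals. A minimal sketch of the general technique, assuming a linear probe (logistic regression) trained on activation vectors labeled sandbagging vs. honest; the activations here are synthetic Gaussians with a weak mean shift, standing in for real hidden states.

```python
# Minimal white-box probe sketch: logistic regression on activation vectors
# labeled sandbagging vs. honest. Activations are synthetic; real probes
# read hidden states from the model under evaluation.

import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
d, n = 64, 2000
honest = rng.normal(0.0, 1.0, size=(n, d))
sandbag = rng.normal(0.0, 1.0, size=(n, d))
sandbag[:, :8] += 0.5   # assume a weak "sandbagging direction" exists

X = np.vstack([honest, sandbag])
y = np.array([0] * n + [1] * n)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)

probe = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
print(f"probe accuracy: {probe.score(X_te, y_te):.1%}")
# Roughly mid-70s on this synthetic setup; the cited 70-85% figures come
# from probes on actual model internals, not this toy data.
```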
Concepts
- Situational Awareness (Capability): Comprehensive analysis of situational awareness in AI systems, documenting that Claude 3 Opus fakes alignment 12% baseline (78% post-RL), 5 of 6 frontier models demonstrate scheming capabilities, a... (Quality: 67/100)
- Existential Risk from AI (Concept): Hypotheses concerning risks from advanced AI systems that some researchers believe could result in human extinction or permanent global catastrophe, including institutional frameworks developed by ... (Quality: 92/100)
- Large Language Models (Concept): Comprehensive assessment of LLM capabilities showing training costs growing 2.4x/year ($78-191M for frontier models, though DeepSeek achieved near-parity at $6M), o3 reaching 91.6% on AIME and 87.5... (Quality: 62/100)
- Persuasion and Social Manipulation (Capability): GPT-4 achieves superhuman persuasion in controlled settings (64% win rate, 81% higher odds with personalization), with AI chatbots demonstrating 4x the impact of political ads (3.9 vs ~1 point vote... (Quality: 63/100). The win-rate/odds conversion follows this list.
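The persuasion card mixes two different units, win rates and odds ratios. A short sketch of the conversion, using the 64% win rate and the 1.81x odds multiplier from the card; combining the two numbers is my arithmetic, not a result from the study.

```python
# Converting between win probability and odds, and applying an odds ratio.
# The 64% win rate and "81% higher odds" figure are from the card above;
# applying one to the other is illustrative arithmetic only.

def to_odds(p: float) -> float:
    return p / (1.0 - p)

def to_prob(odds: float) -> float:
    return odds / (1.0 + odds)

base_p = 0.64
personalized_odds = to_odds(base_p) * 1.81   # "81% higher odds"
print(f"base odds:          {to_odds(base_p):.2f}")             # ~1.78
print(f"personalized win %: {to_prob(personalized_odds):.1%}")  # ~76.3%
```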
Other
- AI Evaluations (Research Area): Evaluations and red-teaming reduce detectable dangerous capabilities by 30-50x when combined with training interventions (o3 covert actions: 13% → 0.4%), but face fundamental limitations against so... (Quality: 72/100). The fold-change arithmetic is sketched after this list.
- Red Teaming (Research Area): Red teaming is a systematic adversarial evaluation methodology for identifying AI vulnerabilities and dangerous capabilities before deployment, with effectiveness rates varying from 10-80% dependin... (Quality: 65/100)
- Jaan Tallinn (Person): Profile of Jaan Tallinn documenting $150M+ lifetime AI safety giving (86% of $51M in 2024), primarily through SFF ($34.33M distributed in a 2025 grant round). Co-founded CSER (2012) and FLI (2014),... (Quality: 53/100)
- Kamal Ndousse (Person): AI safety researcher at Anthropic. Works on evaluations and red-teaming of language models. Contributed to research on emergent deception and sleeper agent behaviors in AI systems.
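A quick check of the 30-50x claim from the o3 numbers cited in the AI Evaluations card (13% → 0.4% covert-action rate); the sketch just makes the fold-change arithmetic explicit.

```python
# Fold-reduction arithmetic behind the "30-50x" claim, using the o3
# covert-action rates (13% -> 0.4%) quoted in the card above.

before, after = 0.13, 0.004
print(f"fold reduction: {before / after:.1f}x")                        # 32.5x
print(f"absolute drop:  {(before - after) * 100:.1f} percentage points")  # 12.6
```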
Organizations
- Alignment Research Center (Organization): Comprehensive reference page on ARC (Alignment Research Center), covering its evolution from a dual theory/evals organization to ARC Theory (3 permanent researchers) plus the METR spin-out (Decembe... (Quality: 57/100)
- Anthropic (Organization): Comprehensive reference page on Anthropic covering financials ($380B valuation, $14B ARR at Series G growing to $19B by March 2026), safety research (Constitutional AI, mechanistic interpretability... (Quality: 74/100)
Key Debates
- AI Accident Risk Cruxes (Crux): Comprehensive survey of AI safety researcher disagreements on accident risks, quantifying probability ranges for mesa-optimization (15-55%), deceptive alignment (15-50%), and P(doom) (5-35% median ... (Quality: 67/100). One standard way to pool such divergent estimates is sketched after this list.
- Technical AI Safety Research (Crux): Technical AI safety research encompasses six major agendas (mechanistic interpretability, scalable oversight, AI control, evaluations, agent foundations, and robustness) with 500+ researchers and $... (Quality: 66/100)
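A minimal sketch of one common way to aggregate divergent probability estimates like those above: averaging in log-odds space (geometric mean of odds). The input values are fabricated points inside the cited 5-35% P(doom) range; the pooling method is a standard forecasting choice, not something the card specifies.

```python
# Aggregating divergent researcher probability estimates by averaging in
# log-odds space. Inputs are made-up points within the 5-35% range cited
# above, not actual survey responses.

import math

def pool_log_odds(probs):
    """Geometric-mean-of-odds pooling: average log-odds, map back to [0, 1]."""
    mean_lo = sum(math.log(p / (1 - p)) for p in probs) / len(probs)
    return 1 / (1 + math.exp(-mean_lo))

estimates = [0.05, 0.10, 0.20, 0.35]
print(f"arithmetic mean: {sum(estimates) / len(estimates):.1%}")  # 17.5%
print(f"log-odds pooled: {pool_log_odds(estimates):.1%}")         # ~14.3%
```

Log-odds pooling weights extreme low estimates more heavily than a plain average, which is why it lands below the arithmetic mean here.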