A nonprofit AI safety and security research organization founded in 2021, known for pioneering AI Control research, developing causal scrubbing interpretability methods, and conducting landmark alignment faking studies with Anthropic.
Safety Agenda
- Anthropic Core Views (Safety Agenda): Anthropic allocates 15-25% of R&D (~$100-200M annually) to safety research, including the world's largest interpretability team (40-60 researchers), while maintaining $5B+ revenue by 2025. Their RSP... (Quality: 62/100)
Approaches
- AI Alignment (Approach): Comprehensive review of AI alignment approaches finding current methods (RLHF, Constitutional AI) show 75%+ effectiveness on measurable safety metrics for existing systems but face critical scalabi... (Quality: 91/100)
- Capability Elicitation (Approach): Capability elicitation (systematically discovering what AI models can actually do through scaffolding, prompting, and fine-tuning) reveals 2-10x performance gaps versus naive testing. METR finds AI a... (Quality: 91/100)
Analysis
- AI Safety Intervention Effectiveness Matrix (Analysis): Quantitative analysis mapping 15+ AI safety interventions to specific risks reveals critical misallocation: 40% of 2024 funding ($400M+) flows to RLHF methods showing only 10-20% effectiveness agai... (Quality: 73/100)
- Model Organisms of Misalignment (Analysis): Model organisms of misalignment is a research agenda creating controlled AI systems exhibiting specific alignment failures as testbeds. Recent work achieves 99% coherence with 40% misalignment rate... (Quality: 65/100)
Policy
- Safe and Secure Innovation for Frontier Artificial Intelligence Models Act (Policy): California's SB 1047 required safety testing, shutdown capabilities, and third-party audits for AI models exceeding 10^26 FLOP or $100M training cost; it passed the legislature (Assembly 48-16, Sen... (Quality: 66/100) A worked compute-threshold sketch follows this list.
- New York RAISE Act (Policy): The New York RAISE Act represents the first comprehensive state-level AI safety legislation with enforceable requirements for frontier AI developers, establishing mandatory safety protocols, incide... (Quality: 73/100)
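The SB 1047 entry above hinges on a 10^26 training-FLOP threshold. As a rough illustration of how such a threshold is checked, here is a minimal sketch using the common ~6 x parameters x tokens approximation for dense-transformer training compute; the parameter and token counts are assumptions chosen for illustration, not figures from the bill or any specific model.

```python
# Minimal sketch, assuming the standard ~6 * parameters * tokens approximation
# for dense-transformer training FLOP. The model scale below is an illustrative
# assumption, not a figure from SB 1047 or any particular training run.

SB1047_FLOP_THRESHOLD = 1e26  # training-compute threshold cited above


def training_flop(params: float, tokens: float) -> float:
    """Rough training-compute estimate: ~6 FLOP per parameter per training token."""
    return 6.0 * params * tokens


if __name__ == "__main__":
    # Hypothetical frontier-scale run: 1e12 parameters trained on 3e13 tokens.
    flop = training_flop(params=1e12, tokens=3e13)
    print(f"Estimated training FLOP: {flop:.2e}")
    print("Above the 10^26 FLOP threshold?", flop > SB1047_FLOP_THRESHOLD)
```

Under these assumptions the run lands at roughly 1.8 x 10^26 FLOP, just above the threshold; the bill's alternative $100M training-cost trigger would be checked separately.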
Organizations
- Alignment Research Center (Organization): Comprehensive reference page on ARC (Alignment Research Center), covering its evolution from a dual theory/evals organization to ARC Theory (3 permanent researchers) plus the METR spin-out (Decembe... (Quality: 57/100)
- Conjecture (Organization): Conjecture is a 30-40 person London-based AI safety org founded in 2022, pursuing Cognitive Emulation (CoEm), building interpretable AI from the ground up rather than aligning LLMs, with $30M+ Series A ... (Quality: 37/100)
- Machine Intelligence Research Institute (Organization): Comprehensive organizational history documenting MIRI's trajectory from pioneering AI safety research (2000-2020) to policy advocacy after acknowledging research failure, with detailed financial da... (Quality: 50/100)
Risks
- Deceptive Alignment (Risk): Comprehensive analysis of deceptive alignment risk, where AI systems appear aligned during training but pursue different goals when deployed. Expert probability estimates range 5-90%, with key empir... (Quality: 75/100)
- AI Capability Sandbagging (Risk): Systematically documents sandbagging (strategic underperformance during evaluations) across frontier models, finding 70-85% detection accuracy with white-box probes, 18-24% accuracy drops on autono... (Quality: 67/100)
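The sandbagging entry above mentions white-box probes reaching 70-85% detection accuracy. The sketch below shows schematically what such a probe typically is: a simple linear classifier trained on hidden-layer activations labeled by behavior. The random "activations" and the injected mean shift are stand-in assumptions for illustration only; this is not the methodology or data of the cited work.

```python
# Minimal sketch of a white-box "probe": a linear classifier over hidden
# activations that flags strategic underperformance. Random vectors stand in
# for real model activations (an assumption made purely for illustration).
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Pretend activations: 1,000 examples x 512-dim hidden states, labeled
# 1 = "sandbagging" transcript, 0 = "honest effort" transcript.
X = rng.normal(size=(1000, 512))
y = rng.integers(0, 2, size=1000)
X[y == 1] += 0.2  # inject a weak linear signal so the probe has something to find

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)
probe = LogisticRegression(max_iter=1000).fit(X_train, y_train)
print(f"Held-out probe accuracy: {probe.score(X_test, y_test):.2f}")
```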
Other
- Ajeya Cotra (Person): Ajeya Cotra is a member of technical staff at METR and former senior advisor at Coefficient Giving (formerly Open Philanthropy), where she led technical AI safety grantmaking including a $25M agent... (Quality: 55/100)
- Holden Karnofsky (Person): Holden Karnofsky directed $300M+ in AI safety funding through Coefficient Giving (formerly Open Philanthropy), growing the field from ~20 to 400+ FTE researchers and developing influential framewor... (Quality: 40/100)
- Interpretability (Research Area): Mechanistic interpretability has extracted 34M+ interpretable features from Claude 3 Sonnet with 90% automated labeling accuracy and demonstrated 75-85% success in causal validation, though less th... (Quality: 66/100)
Concepts
- EA Epistemic Failures in the FTX Era: This page synthesizes post-FTX critiques of EA's epistemic and governance failures, identifying interlocking problems including donor hero-worship, funding concentration in volatile crypto assets, ... (Quality: 84/100)
- Large Language Models (Concept): Comprehensive assessment of LLM capabilities showing training costs growing 2.4x/year ($78-191M for frontier models, though DeepSeek achieved near-parity at $6M), o3 reaching 91.6% on AIME and 87.5... (Quality: 62/100)
- Large Language Models (Capability): Comprehensive analysis of LLM capabilities showing rapid progress from GPT-2 (1.5B parameters, 2019) to GPT-5 and Gemini 2.5 (2025), with training costs growing 2.4x annually and projected to excee... (Quality: 60/100) A compounding-growth sketch follows this list.
- Situational Awareness (Capability): Comprehensive analysis of situational awareness in AI systems, documenting that Claude 3 Opus fakes alignment 12% baseline (78% post-RL), 5 of 6 frontier models demonstrate scheming capabilities, a... (Quality: 67/100)
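Both Large Language Models entries above cite training costs growing roughly 2.4x per year. As a quick worked illustration (not a forecast), the sketch below compounds a base cost in the cited $78-191M range by that factor; the base figure and projection horizon are assumptions chosen for the example.

```python
# Minimal sketch of the ~2.4x/year training-cost growth cited above.
# The base cost and projection horizon are illustrative assumptions.

def projected_cost(base_cost_usd: float, years: int, annual_growth: float = 2.4) -> float:
    """Compound a base training cost by the annual growth factor."""
    return base_cost_usd * annual_growth ** years


if __name__ == "__main__":
    base = 150e6  # mid-range of the $78-191M frontier-model figure cited above
    for year in range(4):
        print(f"Year +{year}: ${projected_cost(base, year) / 1e6:,.0f}M")
```

Starting from $150M, three years of 2.4x growth compounds to roughly $2.1B, which is the shape of the projection the entries above refer to.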
Key Debates
- Why Alignment Might Be Easy (Argument): Synthesizes empirical evidence that alignment is tractable, citing 29-41% RLHF improvements, Constitutional AI reducing bias across 9 dimensions, millions of interpretable features from Claude 3, a... (Quality: 53/100)