Center for AI Safety
Academic
Also known as: CAIS
The Center for AI Safety (CAIS) is a nonprofit organization that works to reduce societal-scale risks from AI. CAIS combines research, field-building, and public communication to advance AI safety. It was co-founded in 2022 by Dan Hendrycks (Executive Director) and Oliver Zhang (Managing Director).
Publications
| Title | Type | Authors | URL | Published | Flagship |
|---|---|---|---|---|---|
| Humanity's Last Exam | paper | Long Phan, Alice Gatti, Ziwen Han, Nathaniel Li et al. | arxiv.org | 2025-01 | ✓ |
| Introduction to AI Safety, Ethics, and Society | book | Dan Hendrycks | aisafetybook.com | 2024-06 | ✓ |
| The WMDP Benchmark: Measuring and Reducing Malicious Use With Unlearning | paper | Nathaniel Li, Alexander Pan, Anjali Gopal et al. | wmdp.ai | 2024 | ✓ |
| Superintelligence Strategy | report | Dan Hendrycks, Eric Schmidt, Alexandr Wang | nationalsecurity.ai | 2025-03 | ✓ |
| Improving Alignment and Robustness with Circuit Breakers | paper | Andy Zou, Long Phan, Justin Wang et al. | arxiv.org | 2024 | — |
| HarmBench: A Standardized Evaluation Framework for Automated Red Teaming | paper | Mantas Mazeika, Long Phan, Xuwang Yin et al. | harmbench.org | 2024 | ✓ |
| Representation Engineering: A Top-Down Approach to AI Transparency | paper | Andy Zou, Long Phan, Sarah Chen et al. | arxiv.org | 2023-10 | ✓ |
| An Overview of Catastrophic AI Risks | paper | Dan Hendrycks, Mantas Mazeika, Thomas Woodside | arxiv.org | 2023-06 | — |
| Statement on AI Risk | policy-brief | CAIS | aistatement.com | 2023-05 | ✓ |
| Universal and Transferable Adversarial Attacks on Aligned Language Models | paper | Andy Zou, Zifan Wang, Nicholas Carlini et al. | llm-attacks.org | 2023 | ✓ |
| Unsolved Problems in ML Safety | paper | Dan Hendrycks, Nicholas Carlini, John Schulman, Jacob Steinhardt | arxiv.org | 2021-09 | ✓ |
| Measuring Massive Multitask Language Understanding (MMLU) | paper | Dan Hendrycks, Collin Burns, Steven Basart, Andy Zou, Mantas Mazeika, Dawn Song, Jacob Steinhardt | arxiv.org | 2020-09 | ✓ |
Top Related Pages
Representation Engineering
A top-down approach to understanding and controlling AI behavior by reading and modifying concept-level representations in neural networks, enabling both monitoring and steering of model behavior (see the illustrative sketch at the end of this page).
Power-Seeking AI
Formal theoretical analysis demonstrates why optimal AI policies tend to acquire power (resources, influence, capabilities) as an instrumental goal.
Existential Risk from AI
Hypotheses concerning risks from advanced AI systems that some researchers believe could result in human extinction or permanent global catastrophe.
Dan Hendrycks
Executive Director of CAIS; focuses on catastrophic AI risk reduction through research, education, and policy advocacy.
Pause Advocacy
Advocacy for slowing or halting frontier AI development until adequate safety measures are in place. Analysis suggests 15-40% probability of meanin...
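The Representation Engineering entry above describes reading and steering concept-level representations. The sketch below illustrates the general idea on a toy PyTorch model: a concept direction is estimated as the difference of mean hidden activations between two contrasting input sets, then added back during the forward pass via a hook. The model, layer choice, inputs, and scaling factor `alpha` are all hypothetical; this is a minimal illustration of the read-then-steer pattern, not CAIS's implementation.

```python
# Minimal sketch of representation reading and steering, in the spirit of
# representation engineering. Toy model and data; all names are hypothetical.
import torch
import torch.nn as nn

torch.manual_seed(0)

# Toy 2-layer MLP standing in for a network whose hidden layer we probe.
model = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 4))
hidden_layer = model[1]  # probe activations after the ReLU

# 1) "Reading": record hidden activations for two contrasting input sets
#    and take the difference of their means as a concept direction.
acts = {}

def record(_module, _inputs, output):
    acts["h"] = output.detach()

handle = hidden_layer.register_forward_hook(record)

pos_inputs = torch.randn(64, 16) + 1.0   # inputs assumed to express the concept
neg_inputs = torch.randn(64, 16) - 1.0   # inputs assumed to lack it
model(pos_inputs)
pos_mean = acts["h"].mean(dim=0)
model(neg_inputs)
neg_mean = acts["h"].mean(dim=0)
direction = pos_mean - neg_mean
direction = direction / direction.norm()
handle.remove()

# 2) "Steering": add a scaled copy of the direction to the hidden state on
#    future forward passes, nudging the model along the concept axis.
alpha = 2.0

def steer(_module, _inputs, output):
    return output + alpha * direction  # returned value replaces the output

steer_handle = hidden_layer.register_forward_hook(steer)
out_steered = model(torch.randn(8, 16))
steer_handle.remove()
print(out_steered.shape)  # torch.Size([8, 4])
```

In practice this kind of difference-of-means probing and activation addition is applied to hidden states of large language models at selected layers; the toy MLP here only demonstrates the pattern of extracting a direction and injecting it with a forward hook.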