Quick Assessment
| Dimension | Assessment |
|---|---|
| Primary Role | CEO and Co-founder, Anthropic (2021–present) |
| Key Contributions | Developed Constitutional AI training methodology; created the Responsible Scaling Policy (RSP) framework with AI Safety Levels |
| Key Publications | Constitutional AI: Harmlessness from AI Feedback (2022); Training a Helpful and Harmless Assistant with Reinforcement Learning from Human Feedback (2022) |
| Institutional Affiliation | Anthropic |
| Influence on AI Safety | Advocates empirical alignment research on frontier models; RSP framework has influenced industry-wide safety policy adoption; Anthropic's mechanistic interpretability program is an active research contribution |
Overview
Dario Amodei is CEO and co-founder of Anthropic, an AI safety company developing Constitutional AI methods and related alignment techniques. His approach to AI development, sometimes described as a "competitive safety" strategy, holds that safety-focused organizations should compete at the frontier while implementing structured safety measures, on the grounds that ceding the frontier to less safety-conscious actors would produce worse outcomes. Amodei estimates a 10–25% probability of AI-caused catastrophe and expects transformative AI by 2026–2030, a view that places him between pause advocates and accelerationists.
This strategy emphasizes empirical alignment research on frontier models, responsible scaling policies, and Constitutional AI techniques. Under his leadership, Anthropic has raised substantial capital while maintaining a stated safety mission, offering one data point on the commercial viability of safety-focused AI development. It has also advanced interpretability research through programs such as the Transformer Circuits project, alongside work on scalable oversight methods.