Longterm Wiki

Jan Leike

VP of Alignment Science at Anthropic; former co-lead of OpenAI Superalignment team; prominent advocate for AI safety resource allocation

Current Role: VP of Alignment Science
Organization: Anthropic

Expert Positions (2 topics)

| Topic | View | Estimate | Confidence | Date | Source |
| --- | --- | --- | --- | --- | --- |
| Current Approaches Scale | Uncertain | 45% | medium | Jul 2023 | OpenAI Superalignment (2023) |
| Will We Get Adequate Warning? | Concerned | 40% | medium | 2023 | |

Education

PhD in Machine Learning, Australian National University


Quick Assessment

| Dimension | Assessment |
| --- | --- |
| Primary Role | VP of Alignment Science at Anthropic (2024–present) |
| Key Contributions | Co-authored early RLHF research; led the Agent Alignment Team at Google DeepMind; co-led OpenAI's Superalignment team; developed reward modeling frameworks |
| Key Publications | "Deep Reinforcement Learning from Human Preferences" (NeurIPS 2017); "Scalable agent alignment via reward modeling" (arXiv 2018); "AI Safety Gridworlds" (arXiv 2017); "Recursively Summarizing Books with Human Feedback" (arXiv 2021) |
| Career Trajectory | PhD, Australian National University (2016) → FHI postdoc (2016) → Senior Research Scientist, Google DeepMind (2016–2021) → Head of Alignment / Superalignment co-lead, OpenAI (January 2021 – May 2024) → Anthropic (2024–present) |
| Notable Event | Departed OpenAI on May 16, 2024; posted publicly on X about his stated reasons for leaving |

Overview

Jan Leike is an AI alignment researcher who has held senior roles at Google DeepMind, OpenAI, and Anthropic. He completed a PhD in reinforcement learning theory at Australian National University in 2016 under the supervision of Marcus Hutter, and subsequently held a brief research fellowship at the Future of Humanity Institute. At DeepMind, he led the Agent Alignment Team and contributed to early RLHF research. He joined OpenAI in January 2021 to lead alignment research, and in July 2023 co-led the formation of the Superalignment team alongside Ilya Sutskever, with a stated goal of solving the core technical challenges of superintelligence alignment within four years. He departed OpenAI on May 16, 2024, and posted a public thread on X explaining his reasons for leaving. He subsequently joined Anthropic, where he heads the Alignment Science team. TIME magazine listed him among the 100 most influential people in AI in both 2023 and 2024.


Facts

People
Employed By: Anthropic
Role / Title: VP of Alignment Science
Biographical
Education: PhD in Machine Learning, Australian National University
Notable For: VP of Alignment Science at Anthropic; former co-lead of OpenAI Superalignment team; prominent advocate for AI safety resource allocation
Social Media: @janleike