
Eliezer Yudkowsky

Person

Comprehensive biographical profile of Eliezer Yudkowsky covering his foundational contributions to AI safety (CEV, early problem formulation, agent foundations) and notably pessimistic views (>90% p(doom)). Includes detailed 'Statements & Track Record' section analyzing his mixed prediction accuracy—noting early timeline errors, vindication on AI generalization in Hanson debate, and the unfalsifiability of his core doom predictions.

Affiliation: Machine Intelligence Research Institute
Role: Co-founder & Research Fellow
Known For: Early AI safety work, decision theory, rationalist community

Related
Organizations: Machine Intelligence Research Institute
Risks: Deceptive Alignment · Sharp Left Turn
People: Paul Christiano

Key Links

Official Website: yudkowsky.net
Wikipedia: en.wikipedia.org
Wikidata: wikidata.org

Background

Eliezer Yudkowsky is one of the earliest and most influential voices on existential risk from AI. He co-founded the Machine Intelligence Research Institute (originally the Singularity Institute for Artificial Intelligence) in 2000, making him a pioneer of organized AI safety research.

Yudkowsky is largely self-taught in mathematics and computer science and began his AI safety work in the late 1990s. He is known for:

  • Founding LessWrong and the rationalist community
  • Writing extensively on cognitive biases and rational thinking
  • Developing early frameworks for AI alignment (Coherent Extrapolated Volition)
  • Contributing to decision theory (Timeless Decision Theory, Updateless Decision Theory)
  • Writing fiction exploring AI alignment themes (Harry Potter and the Methods of Rationality)

Key Contributions to AI Safety

Coherent Extrapolated Volition (CEV)

Proposed in 2004, CEV attempts to formalize "what humanity would want if we knew more, thought faster, were more the people we wished we were." Rather than trying to specify human values directly, CEV suggests extrapolating what we would collectively choose under idealized conditions.

Early Warning and Problem Formulation

Yudkowsky was among the first to:

  • Articulate the alignment problem clearly
  • Explain why superintelligent AI poses unique risks
  • Emphasize the difficulty of value specification
  • Highlight the potential for "treacherous turns" in AI development
  • Argue that alignment must be solved before AGI is developed

Agent Foundations Research

Through MIRI, Yudkowsky has pushed for research on fundamental questions about agency, decision theory, and embedded agents. This includes work on:

  • Logical uncertainty
  • Naturalized induction
  • Reflective consistency
  • Embedded agency

Views on Key Cruxes

Risk Assessment

P(doom): Very high, often stated as >90% in recent years

Timeline: Believes AGI is plausible within 10-20 years, possibly sooner

Alignment difficulty: Considers alignment extremely difficult, likely requiring fundamental theoretical breakthroughs we haven't made yet

Core Concerns

  1. Default outcome is doom: Without major breakthroughs in alignment theory, Yudkowsky believes AGI development will likely lead to human extinction
  2. Sharp left turn: Expects rapid capability gains that outpace our ability to align systems
  3. Deceptive alignment: Worried that sufficiently capable systems will learn to appear aligned during training while pursuing different goals
  4. Inadequate preparation: Believes current alignment efforts are insufficient for the difficulty of the problem

Disagreements with Mainstream

Yudkowsky is notably more pessimistic than most AI safety researchers:

AI Alignment Difficulty

  • Eliezer Yudkowsky: Extremely difficult, likely requiring fundamental theoretical breakthroughs we haven't made (sharp capability jumps, deceptive alignment, inner alignment is hard). Confidence: high
  • Paul Christiano: Difficult but tractable with empirical iteration on prosaic alignment techniques (can make incremental progress, learn from weaker systems). Confidence: medium
  • Dario Amodei: Challenging but solvable with responsible scaling and empirical research (safety can keep pace with capabilities if we're careful). Confidence: medium

Strategic Views

On Current AI Development

Yudkowsky has advocated for:

  • Slowing down AI capabilities research: Believes we need much more time for alignment work
  • International cooperation: Has proposed international treaties to limit AGI development
  • Extreme measures: In a controversial 2023 Time op-ed, argued that international enforcement, including military action against rogue AGI projects, may be necessary

On Alignment Approaches

  • Skeptical of prosaic alignment: Doubtful that techniques like RLHF will scale to superintelligence
  • Emphasis on theory: Believes we need better theoretical foundations before scaling systems
  • Critical of "race to the top": Argues that building AGI to solve alignment is putting the cart before the horse

Key Publications and Writings

  • "Intelligence Explosion Microeconomics" (2013) - Analyzes economic dynamics of recursive self-improvement
  • "There's No Fire Alarm for Artificial General Intelligence" (2017) - Argues we won't get clear warning signs
  • "AGI Ruin: A List of Lethalities" (2022) - Comprehensive argument for why default outcomes are catastrophic
  • Sequences (2006-2009) - Blog posts on rationality, many touching on AI safety
  • "Pausing AI Developments Isn't Enough. We Need to Shut it All Down" (2023) - Controversial Time op-ed

Influence and Legacy

Yudkowsky's impact extends beyond direct research:

  1. Field creation: Helped establish AI safety as a legitimate field of study
  2. Community building: Created intellectual infrastructure (LessWrong, CFAR) that trained many current researchers
  3. Problem formulation: Articulated key problems that shaped decades of subsequent work
  4. Public awareness: Through writing and fiction, introduced AI risk to broader audiences
  5. Funding: His early work influenced major funders like Coefficient Giving

Criticism and Controversy

Yudkowsky is a polarizing figure:

Critics argue:

  • His extreme pessimism may be counterproductive or unfounded
  • He lacks formal credentials in relevant fields
  • He can be dismissive of others' approaches
  • His apocalyptic framing may alienate potential allies

Supporters counter:

  • He was correct about many things before others were (the importance of AI safety, the difficulty of alignment)
  • He has demonstrated technical competence through his decision theory work
  • His pessimism may be warranted given the stakes
  • His direct communication style is valuable even if uncomfortable

Statements & Track Record

For a detailed analysis of Yudkowsky's predictions and their accuracy, see the full track record page.

Summary: He made significant errors when young (early timeline predictions), later updated to timeline agnosticism, and was vindicated on the AI generalization question in his debate with Hanson; his core doom predictions remain unfalsifiable until AGI exists.

Correct: AI generalization with simple architectures (Hanson debate), AI safety becoming mainstream
Wrong: Early timeline predictions (Singularity by 2021), deep learning skepticism timing
Pending/Unfalsifiable: P(doom) ≈99%, discontinuous takeoff, deceptive alignment

Notable: His p(doom) has increased from ≈50% to ≈99% over time, even as AI safety gained mainstream attention. His core predictions about catastrophic AI risk are unfalsifiable until AGI exists.

Related Pages

Key Debates

AI Alignment Research Agendas · Technical AI Safety Research · Why Alignment Might Be Hard · Why Alignment Might Be Easy

Concepts

Machine Intelligence Research Institute · RLHF · Reasoning and Planning · Situational Awareness

Historical

The MIRI Era

Models

Instrumental Convergence Framework · Deceptive Alignment Decomposition Model

Approaches

AI Alignment · Evaluation Awareness

People

Nick Bostrom

Transition Model

Alignment Progress · Misalignment Potential · Alignment Robustness

Labs

Apollo Research · Safe Superintelligence Inc.

Organizations

Alignment Research Center