Dan Hendrycks

Person

Summary

Biographical overview of Dan Hendrycks, CAIS director who coordinated the May 2023 AI risk statement signed by major AI researchers. Covers his technical work on benchmarks (MMLU, ETHICS), robustness research, and institution-building efforts, emphasizing his focus on catastrophic AI risk as a global priority.

Affiliation: Center for AI Safety
Role: Director
Known For: AI safety research, benchmark creation, CAIS leadership, catastrophic risk focus
Related: Center for AI Safety (organization) · Compute Governance (policy) · Yoshua Bengio (person)

Background

Dan Hendrycks is the director of the Center for AI Safety (CAIS) and a prominent researcher focused on catastrophic and existential risks from AI. He has made significant contributions to both technical AI safety research and public awareness of AI risks.

Key facts:

  • PhD in Computer Science from UC Berkeley
  • Post-doc at UC Berkeley
  • Founded Center for AI Safety
  • Research on robustness, uncertainty, and safety

Hendrycks combines rigorous technical research with effective communication and institution-building to advance AI safety.

Major Contributions

Center for AI Safety (CAIS)

Hendrycks founded CAIS as an organization focused on:

  • Reducing catastrophic risks from AI
  • Technical safety research
  • Public awareness and advocacy
  • Connecting researchers and resources

Impact: CAIS has become a major hub for AI safety work, coordinating research and advocacy.

Statement on AI Risk (May 2023)

Coordinated landmark statement: "Mitigating the risk of extinction from AI should be a global priority alongside other societal-scale risks such as pandemics and nuclear war."

Signatories included:

  • Geoffrey Hinton
  • Yoshua Bengio
  • Sam Altman (OpenAI)
  • Demis Hassabis (DeepMind)
  • Dario Amodei (Anthropic)
  • Hundreds of AI researchers

Impact: The statement significantly raised the profile of AI existential risk and made it a mainstream concern.

Technical Research

Significant contributions to:

AI Safety Benchmarks (a scoring sketch follows this list):

  • ETHICS dataset - evaluating moral reasoning in language models
  • MMLU ("Measuring Massive Multitask Language Understanding", sometimes called the Hendrycks Test) - measuring broad knowledge across 57 subjects
  • Safety-specific evaluation methods
  • Adversarial robustness testing
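
As an illustration of how an MMLU-style multiple-choice benchmark is scored, here is a minimal sketch in Python. The sample question, the model_predict stub, and the accuracy helper are hypothetical placeholders for illustration, not actual MMLU data or CAIS code.

```python
# Minimal sketch of MMLU-style multiple-choice scoring (hypothetical data).

questions = [
    {"question": "Which gas makes up most of Earth's atmosphere?",
     "choices": ["Oxygen", "Nitrogen", "Carbon dioxide", "Argon"],
     "answer": 1},  # index of the correct choice
]

def model_predict(question: str, choices: list[str]) -> int:
    """Stand-in for a real model: returns the index of its chosen answer."""
    return 0  # trivial baseline that always picks the first option

def accuracy(items) -> float:
    """Fraction of questions where the model's choice matches the answer key."""
    correct = sum(model_predict(q["question"], q["choices"]) == q["answer"]
                  for q in items)
    return correct / len(items)

print(f"Accuracy: {accuracy(questions):.2%}")  # random guessing over 4 options is ~25%
```

Reported MMLU scores are essentially this accuracy, averaged over the benchmark's 57 subjects.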

Uncertainty and Robustness (a detection sketch follows this list):

  • Out-of-distribution detection
  • Robustness to distribution shift
  • Calibration of neural networks
  • Anomaly detection
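
Hendrycks' early work in this area includes the widely used maximum softmax probability (MSP) baseline for out-of-distribution detection, introduced in "A Baseline for Detecting Misclassified and Out-of-Distribution Examples in Neural Networks" (Hendrycks & Gimpel, 2017). A minimal sketch of the idea, using made-up logits and an arbitrary threshold rather than outputs from a real trained classifier:

```python
# Maximum softmax probability (MSP) baseline for OOD detection:
# flag inputs whose top softmax score falls below a threshold.
import numpy as np

def softmax(logits: np.ndarray) -> np.ndarray:
    z = logits - logits.max(axis=-1, keepdims=True)  # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def msp_score(logits: np.ndarray) -> np.ndarray:
    """Confidence score per example: the maximum softmax probability."""
    return softmax(logits).max(axis=-1)

# Hypothetical logits: the first example looks confidently in-distribution,
# the second is near-uniform, suggesting an unfamiliar input.
logits = np.array([[6.0, 0.5, -1.0],
                   [0.2, 0.1, 0.0]])

threshold = 0.7  # in practice tuned on held-out in-distribution data
for conf in msp_score(logits):
    label = "in-distribution" if conf >= threshold else "possibly OOD"
    print(f"confidence={conf:.2f} -> {label}")
```

Later work, including by Hendrycks and collaborators, builds on this baseline with stronger detection scores and outlier-exposure training, but MSP remains a standard point of comparison.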

Natural Adversarial Examples:

  • Real-world failure modes
  • Testing model robustness
  • Understanding generalization limits

Research Philosophy

Focus on Catastrophic Risk

Hendrycks emphasizes:

  • Not just any AI safety issue
  • Specifically catastrophic/existential risks
  • High-stakes scenarios
  • Long-term implications

Empirical and Practical

Approach characterized by:

  • Concrete benchmarks and metrics
  • Testing on real systems
  • Measurable progress
  • Actionable results

Bridging Research and Policy

Works to:

  • Make research policy-relevant
  • Communicate findings clearly
  • Engage with policymakers
  • Translate technical work to action

Views on AI Risk

Dan Hendrycks' Risk Assessment

Dan Hendrycks has been explicit and consistent about the severity of catastrophic risks from AI, positioning them alongside society's most pressing existential threats. His actions—founding CAIS, coordinating the May 2023 AI risk statement signed by major AI researchers, and maintaining an active research program—demonstrate his belief that technical solutions are both necessary and achievable, though time is of the essence.

Catastrophic risk priority
Estimate: On par with pandemics and nuclear war
Reasoning: Hendrycks coordinated the May 2023 Statement on AI Risk, which explicitly positioned extinction risk from AI as a global priority alongside pandemics and nuclear war. This framing was deliberate and was endorsed by hundreds of leading AI researchers, including Geoffrey Hinton, Yoshua Bengio, and the CEOs of major AI labs. The parallel to other existential risks signals that AI risk deserves similar institutional resources, research funding, and policy attention as these established threats.

Need for action
Estimate: Urgent
Reasoning: Hendrycks founded the Center for AI Safety and coordinated the landmark 2023 statement specifically to accelerate action on catastrophic AI risks. His decision to focus CAIS explicitly on catastrophic and existential risks, rather than broader AI safety concerns, reflects his assessment that these high-stakes scenarios require immediate attention. The timing and prominence of the statement suggest he believes we are in a critical window where preventive measures can still be effective.

Technical tractability
Estimate: Research can reduce risk
Reasoning: CAIS maintains an active research program spanning technical safety research, compute governance, and ML safety education. This investment indicates Hendrycks' belief that concrete technical work (developing robustness measures, creating safety benchmarks, and training the next generation of safety researchers) can meaningfully reduce catastrophic risks. His focus on empirical methods and measurable progress suggests optimism that systematic research can address key problems before advanced AI systems are deployed.

Core Concerns

  1. Catastrophic risks are real: AI poses existential-level threats
  2. Need technical and governance solutions: Both required
  3. Current systems already show concerning behaviors: Problems visible now
  4. Rapid capability growth: Moving faster than safety work
  5. Coordination challenges: Individual labs can't solve alone

Strategic Approach

Multi-pronged:

  • Technical research on safety
  • Public awareness and advocacy
  • Policy engagement
  • Field building and coordination

Pragmatic:

  • Work with systems as they are
  • Focus on measurable improvements
  • Build coalitions
  • Incremental progress

CAIS Work

Research Programs

Technical Safety:

  • Robustness research
  • Evaluation methods
  • Alignment techniques
  • Empirical studies

Compute Governance:

  • Hardware-level safety measures
  • Compute tracking and allocation
  • International coordination
  • Supply chain interventions

ML Safety Course:

  • Educational curriculum
  • Training next generation
  • Making safety knowledge accessible
  • Academic integration

Advocacy and Communication

Statement on AI Risk:

  • Coordinated broad consensus
  • Brought issue to mainstream
  • Influenced policy discussions
  • Demonstrated unity in field

Public Communication:

  • Media appearances
  • Op-eds and articles
  • Talks and presentations
  • Social media engagement

Field Building

Connecting Researchers:

  • Workshops and conferences
  • Research collaborations
  • Funding opportunities
  • Community building

Key Publications

Safety Benchmarks

  • "ETHICS: Measuring Ethical Reasoning in Language Models" - Evaluating moral reasoning
  • "Measuring Massive Multitask Language Understanding" (MMLU) - Comprehensive knowledge benchmark
  • "Natural Adversarial Examples" - Real-world robustness testing

Technical Safety

  • "Unsolved Problems in ML Safety" - Research agenda
  • "Out-of-Distribution Detection" - Methods for identifying distribution shift
  • "Robustness research" - Multiple papers on making models more robust

Position Papers

  • "X-Risk Analysis for AI Research" - Framework for thinking about catastrophic risks
  • Contributions to policy discussions - Technical input for governance

Public Impact

Raising Awareness

The Statement on AI Risk:

  • Reached global media
  • Influenced policy discussions
  • Made x-risk mainstream
  • Built consensus among experts

Policy Influence

Hendrycks' work has influenced:

  • Congressional testimony and hearings
  • EU AI Act discussions
  • International coordination efforts
  • Industry standards

Academic Integration

CAIS has helped:

  • Make safety research academically respectable
  • Create curricula and courses
  • Train students in safety
  • Publish in top venues

Unique Contributions

Consensus Building

Exceptional at:

  • Bringing together diverse groups
  • Finding common ground
  • Building coalitions
  • Coordinating action

Communication

Skilled at:

  • Explaining technical concepts clearly
  • Reaching different audiences
  • Media engagement
  • Policy translation

Pragmatic Approach

Focuses on:

  • What can actually be done
  • Working with current systems
  • Measurable progress
  • Building bridges

Current Priorities at CAIS

  1. Technical safety research: Advancing robustness and alignment
  2. Compute governance: Hardware-level interventions
  3. Public awareness: Maintaining pressure on the issue
  4. Policy engagement: Influencing regulation and governance
  5. Field building: Growing the safety research community

Evolution of Focus

Early research:

  • Robustness and uncertainty
  • Benchmarks and evaluation
  • Academic ML research

Growing safety focus:

  • Increasingly concerned about risks
  • Founded CAIS
  • More explicit about catastrophic risks

Current:

  • Explicitly focused on x-risk
  • Leading advocacy efforts
  • Building coalitions
  • Policy engagement

Criticism and Challenges

Some argue:

  • Focus on catastrophic risk might neglect near-term harms
  • Statement was too brief/vague
  • Consensus might paper over important disagreements

Supporters argue:

  • X-risk deserves special focus
  • Brief statement was strategically effective
  • Consensus demonstrates seriousness of concern

Hendrycks' approach:

  • X-risk is priority but not only concern
  • Brief statement was feature, not bug
  • Diversity of views compatible with shared concern

Vision for the Field

Hendrycks envisions:

  • AI safety as central to AI development
  • Strong safety standards and regulations
  • International coordination on AI
  • Technical solutions to catastrophic risks
  • Safety research well-funded and respected

Related Pages

Approaches: Eval Saturation & The Evals Gap · AI-Human Hybrid Systems
Concepts: Anthropic · OpenAI · Self-Improvement and Recursive Enhancement
Organizations: US AI Safety Institute · UK AI Safety Institute
Risks: AI Distributional Shift · AI-Induced Irreversibility
Safety Research: AI Evaluations · Anthropic Core Views
Transition Model: AI Capabilities · Lab Behavior
Key Debates: Technical AI Safety Research · AI Risk Critical Uncertainties Model
Models: Alignment Robustness Trajectory Model · Carlsmith's Six-Premise Argument
People: Geoffrey Hinton
Analysis: Short AI Timeline Policy Implications · OpenAI Foundation Governance Paradox