Page StatusContent

Edited 2 weeks ago1.3k words2 backlinks

Summary

Biography of Connor Leahy, CEO of Conjecture AI safety company, who transitioned from co-founding EleutherAI (open-source LLMs) to focusing on interpretability-first alignment. He advocates for very short AGI timelines (2-5 years) and high existential risk, emphasizing mechanistic understanding over empirical tinkering.

Connor Leahy

Person

Connor Leahy

AffiliationConjecture

RoleCEO & Co-founder

Known ForFounding Conjecture, AI safety advocacy, interpretability research

Websiteconjecture.dev

Safety Agendas

People

1.3k words · 2 backlinks

Person

Connor Leahy

AffiliationConjecture

RoleCEO & Co-founder

Known ForFounding Conjecture, AI safety advocacy, interpretability research

Websiteconjecture.dev

Safety Agendas

People

1.3k words · 2 backlinks

Background

Connor Leahy is the CEO and co-founder of Conjecture, an AI safety company focused on interpretability and "prosaic" approaches to AGI alignment. He represents a new generation of AI safety researchers who are building organizations specifically to tackle alignment.

Background:

Largely self-taught in AI and machine learning
Co-founder of EleutherAI (open-source AI research collective)
Founded Conjecture in 2022
Active public communicator on AI risk

Leahy's journey from open-source AI contributor to safety company founder reflects growing concern about AI risks among those building the technology.

From EleutherAI to Conjecture

EleutherAI

Co-founded EleutherAI, which:

Created GPT-Neo and GPT-J (open-source language models)
Demonstrated capabilities research outside major labs
Showed small teams could train large models
Made AI research more accessible

The shift: Working on capabilities research convinced Leahy that AI risk was severe and urgent.

Why Conjecture?

Founded Conjecture because:

Believed prosaic AGI was coming soon
Thought existing safety work insufficient
Wanted to work on alignment with urgency
Needed independent organization focused solely on safety

Conjecture's Approach

Mission

Conjecture aims to:

Understand how AI systems work (interpretability)
Build safely aligned AI systems
Prevent catastrophic outcomes from AGI
Work at frontier of capabilities to ensure safety relevance

Research Focus

Interpretability:

Understanding neural networks mechanistically
Automated interpretability methods
Scaling understanding to large models

Alignment:

Prosaic alignment techniques
Testing alignment on current systems
Building aligned systems from scratch

Capability evaluation:

Understanding what models can really do
Detecting dangerous capabilities early
Red-teaming and adversarial testing

Views on AI Risk

Risk Assessment Estimates

Connor Leahy's public statements and interviews reveal a notably urgent perspective on AI risk compared to many researchers in the field. He combines very short timelines with high existential risk estimates, arguing that the default trajectory leads to catastrophic outcomes without significant changes to current approaches. His position emphasizes the need for immediate technical work on alignment rather than relying on slower governance interventions.

Assessment	Estimate	Reasoning
AGI timeline	Could be 2-5 years (2023)	Leahy believes AGI could arrive much sooner than mainstream estimates, pointing to rapid capability gains in language models and fewer remaining barriers than most researchers assume. His direct work on capabilities at EleutherAI gave him firsthand experience with how quickly scaling can produce surprising jumps in performance, making him skeptical of longer timeline projections.
P(doom)	High without major changes (2023)	Leahy expresses very high concern about default outcomes if alignment research doesn't advance dramatically. He argues that current prosaic approaches to AI development naturally lead to misaligned systems, and that existing safety techniques are fundamentally insufficient for systems approaching AGI capabilities. His transition from capabilities work to founding a safety company reflects deep worry about the baseline trajectory.
Urgency	Extreme (2024)	Leahy emphasizes the need for immediate action on alignment, arguing that the window for developing adequate safety measures is closing rapidly. He believes the field cannot afford to wait for theoretical breakthroughs or gradual governance changes, instead requiring urgent empirical work on interpretability and alignment with current systems to prepare for imminent advanced AI.

Core Beliefs

AGI is very near: Could be 2-10 years, possibly sooner
Default outcome is bad: Without major changes, things go poorly
Prosaic alignment is crucial: Need to align systems similar to current ones
Interpretability is essential: Can't align what we don't understand
Need to move fast: Limited time before dangerous capabilities emerge

On Timelines

Leahy is notably more pessimistic about timelines than most:

Believes AGI could be very close
Points to rapid capability gains
Sees fewer barriers than many assume
Emphasizes uncertainty but leans short

Strategic Position

Different from slowdown advocates:

Doesn't think we'll successfully slow down
Believes we need solutions that work in fast-moving world
Focuses on technical alignment over governance alone

Different from race-to-the-top:

Very concerned about safety
Skeptical of "building AGI to solve alignment"
Wants fundamental understanding first

Public Communication

Vocal AI Safety Advocate

Leahy is very active in public discourse:

Regular podcast appearances
Social media presence (Twitter/X)
Interviews and talks
Blog posts and essays

Key Messages

On urgency:

AGI could arrive much sooner than people think
We're not prepared
Need to take this seriously now

On capabilities:

Current systems are more capable than commonly believed
Emergent capabilities make prediction hard
Safety must account for rapid jumps

On solutions:

Need mechanistic understanding
Can't rely on empirical tinkering alone
Interpretability is make-or-break

Communication Style

Known for:

Direct, sometimes blunt language
Willingness to express unpopular views
Engaging in debates
Not mincing words about risks

Research Philosophy

Interpretability First

Believes:

Can't safely deploy what we don't understand
Black-box approaches fundamentally insufficient
Need to open the black box before scaling further
Interpretability isn't optional

Prosaic Focus

Working on:

Systems similar to current architectures
Alignment techniques that work today
Scaling understanding to larger models
Not waiting for theoretical breakthroughs

Empirical Approach

Emphasizes:

Testing ideas on real systems
Learning from current models
Rapid iteration
Building working systems

Conjecture's Work

Research Areas

Automated Interpretability:

Using AI to help understand AI
Scaling interpretability techniques
Finding circuits and features automatically

Capability Evaluation:

Understanding what models can do
Red-teaming frontier systems
Developing evaluation frameworks

Alignment Testing:

Empirical evaluation of alignment techniques
Stress-testing proposed solutions
Finding failure modes

Public Output

Conjecture has:

Published research on interpretability
Released tools for safety research
Engaged in public discourse
Contributed to alignment community

Influence and Impact

Raising Urgency

Leahy's advocacy has:

Brought attention to short timelines
Emphasized severity of risk
Recruited people to safety work
Influenced discourse on urgency

Building Alternative Model

Conjecture demonstrates:

Can build safety-focused company
Don't need to be at frontier labs
Independent safety research viable
Multiple organizational models possible

Community Engagement

Active in:

Alignment research community
Public communication about AI risk
Mentoring and advising
Connecting researchers

Criticism and Debates

Critics argue:

May be too pessimistic about timelines
Some statements are inflammatory
Conjecture's approach might not scale
Public communication sometimes counterproductive

Supporters argue:

Better to be cautious about timelines
Direct communication is valuable
Conjecture doing important work
Field needs diverse voices

Leahy's position:

Prefers to be wrong about urgency than complacent
Believes directness is necessary
Open to criticism and debate
Focused on solving problem

Evolution of Views

EleutherAI era:

Focused on democratizing AI
Excited about capabilities
Less concerned about risk

Transition:

Growing concern from working with models
Seeing rapid capability gains
Understanding alignment difficulty

Current:

Very concerned about risk
Focused entirely on safety
Urgent timeline beliefs
Public advocacy

Current Priorities

At Conjecture:

Interpretability research: Understanding how models work
Capability evaluation: Knowing what's possible
Alignment testing: Validating proposed solutions
Public communication: Raising awareness
Team building: Growing safety research capacity

Key Insights

From Building Capabilities to Safety

Leahy's experience building language models convinced him:

Capabilities can surprise
Scaling works better than expected
Safety is harder than it looks
Need fundamental understanding

On the Field

Observations about AI safety:

Not enough urgency
Too much theorizing, not enough empirical work
Need more attempts at solutions
Can't wait for perfect understanding

Connor Leahy

Connor Leahy

Connor Leahy

Background

From EleutherAI to Conjecture

EleutherAI

Why Conjecture?

Conjecture's Approach

Mission

Research Focus

Views on AI Risk

Risk Assessment Estimates

Core Beliefs

On Timelines

Strategic Position

Public Communication

Vocal AI Safety Advocate

Key Messages

Communication Style

Research Philosophy

Interpretability First

Prosaic Focus

Empirical Approach

Conjecture's Work

Research Areas

Public Output

Influence and Impact

Raising Urgency

Building Alternative Model

Community Engagement

Criticism and Debates

Evolution of Views

Current Priorities

Key Insights

From Building Capabilities to Safety

On the Field

Related Pages

Top Related Pages

Interpretability

ControlAI

Conjecture

Chris Olah

Neel Nanda

Risks

Concepts

Approaches

Safety Research

Policy

Labs

Organizations

Key Debates

Analysis

Historical

Transition Model