Conjecture
Conjecture is a 30-40 person, London-based AI safety organization founded in 2021 that pursues Cognitive Emulation (CoEm) - building interpretable AI from the ground up rather than aligning existing LLMs - backed by $30M+ in Series A funding. Founded by Connor Leahy (previously of EleutherAI), it faces high uncertainty about CoEm's competitiveness (3-5 year timeline) and risks from commercial pressure.
Overview
Conjecture is an AI safety research organization founded in 2021 by Connor Leahy and a team of researchers concerned about existential risks from advanced AI. The organization pursues a distinctive technical approach centered on "Cognitive Emulation" (CoEm) - building interpretable AI systems based on principles of human cognition rather than aligning existing large language models.
Based in London with a team of 30-40 researchers, Conjecture raised over $30M in Series A funding in 2023. Their research agenda emphasizes mechanistic interpretability and understanding neural network internals, representing a fundamental alternative to the mainstream prosaic alignment approaches pursued by organizations like Anthropic and OpenAI.
| Aspect | Assessment | Evidence | Source |
|---|---|---|---|
| Technical Innovation | High | Novel CoEm research agenda | Conjecture Blog |
| Funding Security | Strong | $30M+ Series A (2023) | TechCrunch coverage |
| Research Output | Moderate | Selective publication strategy | Research publications |
| Influence | Growing | European AI policy engagement | UK AISI |
Risk Assessment
| Risk Category | Severity | Likelihood | Timeline | Trend |
|---|---|---|---|---|
| CoEm Uncompetitive | High | Moderate | 3-5 years | Uncertain |
| Commercial Pressure Compromise | Medium | High | 2-3 years | Worsening |
| Research Insularity | Low | Moderate | Ongoing | Stable |
| Funding Sustainability | Medium | Low | 5+ years | Improving |
Founding and Evolution
Origins (2021)
Conjecture emerged from the EleutherAI collective, an open-source AI research group best known for releasing open-source GPT-3-style language models (GPT-J, GPT-NeoX). Key founding factors:
| Factor | Impact | Details |
|---|---|---|
| EleutherAI Experience | High | Demonstrated capability replication feasibility |
| Safety Concerns | High | Recognition of risks from capability proliferation |
| European Gap | Medium | Limited AI safety ecosystem outside Bay Area |
| Funding Availability | Medium | Growing investor interest in AI safety |
Philosophical Evolution: The transition from EleutherAI's "democratize AI" mission to Conjecture's safety-focused approach represents a significant shift in thinking about AI development and publication strategies.
Funding Trajectory
| Year | Funding Stage | Amount | Impact |
|---|---|---|---|
| 2021 | Seed | Undisclosed | Initial team of ~15 researchers |
| 2023 | Series A | $30M+ | Scaled to 30-40 researchers |
| 2024 | Operating | Ongoing | Sustained research operations |
Cognitive Emulation (CoEm) Research Agenda
Core Philosophy
Conjecture's signature approach contrasts sharply with mainstream AI development:
| Approach | Philosophy | Methods | Evaluation |
|---|---|---|---|
| Prosaic Alignment | Train powerful LLMs, align post-hoc | RLHF, Constitutional AI | Behavioral testing |
| Cognitive Emulation | Build interpretable systems from ground up | Human cognition principles | Mechanistic understanding |
Key Research Components
Mechanistic Interpretability
- Circuit discovery in neural networks
- Feature attribution and visualization
- Scaling interpretability to larger models
- Interpretability research collaborations (a feature-attribution sketch follows this list)
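To make the feature-attribution bullet concrete, here is a minimal sketch of gradient-times-input attribution on a hypothetical toy network. The model, inputs, and choice of method are illustrative assumptions for exposition, not Conjecture's actual tooling or results.

```python
# Illustrative only: gradient-x-input attribution on a toy model, asking
# which input features the model's score depends on most.
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical "network under study": 4 input features -> 1 score.
model = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 1))

x = torch.tensor([[0.5, -1.2, 3.0, 0.1]], requires_grad=True)
score = model(x).sum()
score.backward()

# Gradient-times-input: a crude per-feature attribution of the score.
attribution = (x.grad * x).detach().squeeze()
for i, value in enumerate(attribution.tolist()):
    print(f"feature {i}: attribution {value:+.3f}")
```

Circuit discovery asks the harder follow-up question: not just which inputs matter, but which internal components implement the computation.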
Architecture Design
- Modular systems for better control (see the sketch after this list)
- Interpretability-first design choices
- Trading capabilities for understanding
- Novel training methodologies
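As a rough illustration of what interpretability-first, modular design can mean in practice, the sketch below wires a few named steps into a pipeline that records every intermediate result for human audit. The step names, trace format, and overall decomposition are hypothetical assumptions; Conjecture has not published CoEm implementation details, so this shows only the design principle of bounded, auditable components.

```python
# Hypothetical sketch of an "interpretability-first" modular pipeline:
# each step is a named, bounded component whose inputs and outputs are
# recorded in a human-readable trace.
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class TracedPipeline:
    steps: List[Tuple[str, Callable[[str], str]]] = field(default_factory=list)
    trace: List[Tuple[str, str, str]] = field(default_factory=list)

    def add_step(self, name: str, fn: Callable[[str], str]) -> None:
        self.steps.append((name, fn))

    def run(self, query: str) -> str:
        state = query
        for name, fn in self.steps:
            out = fn(state)
            # Every intermediate result is recorded so a human can audit
            # exactly how the final answer was produced.
            self.trace.append((name, state, out))
            state = out
        return state


# Toy usage with placeholder step functions.
pipeline = TracedPipeline()
pipeline.add_step("parse", lambda q: f"parsed({q})")
pipeline.add_step("retrieve", lambda p: f"facts_for({p})")
pipeline.add_step("answer", lambda f: f"answer_from({f})")

result = pipeline.run("Why is the sky blue?")
for name, inp, out in pipeline.trace:
    print(f"[{name}] {inp} -> {out}")
print("final:", result)
```

The trade-off named in the list above (trading capabilities for understanding) is visible even in this toy: each component is legible and replaceable, at the cost of the end-to-end flexibility a single large model provides.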
Model Organisms
- Smaller, interpretable test systems
- Alignment property verification
- Deception detection research
- Goal representation analysis (see the probe sketch after this list)
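One common recipe in model-organism work is linear probing: train a small classifier on a model's internal activations to test whether a property such as a represented goal is linearly readable. The sketch below runs that recipe on synthetic activations; the data, the assumed goal-encoding direction, and the probe itself are fabricated for illustration and are not results from any real system.

```python
# Illustrative "model organism" probe: can a linear probe recover a hidden
# goal label from (synthetic) activations? Data is fabricated purely to
# demonstrate the probing recipe.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

n_examples, hidden_dim = 1000, 32
goal = rng.integers(0, 2, size=n_examples)        # hidden "goal" label per example
goal_direction = rng.normal(size=hidden_dim)      # assumed goal-encoding direction
activations = rng.normal(size=(n_examples, hidden_dim)) + 0.5 * np.outer(goal, goal_direction)

# Fit the probe on the first 800 examples, evaluate on the held-out 200.
probe = LogisticRegression(max_iter=1000).fit(activations[:800], goal[:800])
accuracy = probe.score(activations[800:], goal[800:])
print(f"probe accuracy on held-out activations: {accuracy:.2f}")
```

High probe accuracy on a real model would suggest the property is encoded in its activations; it would not by itself show the model uses that representation, which is where causal methods come in.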
Key Personnel
Leadership Team
Connor Leahy Profile
| Aspect | Details |
|---|---|
| Background | EleutherAI collective member, GPT-J contributor |
| Evolution | From open-source advocacy to safety-focused research |
| Public Role | Active AI policy engagement, podcast appearances |
| Views | Short AI timelines, high P(doom), interpretability seen as necessary for safety |
Timeline Estimates: Leahy has consistently expressed short AI timeline views, suggesting AGI within years rather than decades.
Research Focus Areas
Mechanistic Interpretability
| Research Area | Status | Key Questions |
|---|---|---|
| Circuit Analysis | Active | How do transformers implement reasoning? (see the ablation sketch after this table) |
| Feature Extraction | Ongoing | What representations emerge in training? |
| Scaling Methods | Development | Can interpretability scale to AGI-level systems? |
| Goal Detection | Early | How can we detect goal-directedness mechanistically? |
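The circuit-analysis and goal-detection rows both lean on causal interventions: change one internal component and see whether behavior changes. The sketch below shows the simplest such intervention, ablating a single hidden unit in a hypothetical toy network and measuring the effect on the output; the model and the choice of unit are illustrative assumptions, not published Conjecture results.

```python
# Illustrative ablation experiment: zero out one hidden unit and measure how
# much the output changes - a crude stand-in for the causal interventions
# used in circuit analysis.
import torch
import torch.nn as nn

torch.manual_seed(0)
model = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 1))
x = torch.randn(16, 4)

baseline = model(x)

unit_to_ablate = 3  # hypothetical choice of hidden unit


def ablate_unit(module, inputs, output):
    # Forward hooks may return a replacement output; here we zero one unit.
    patched = output.clone()
    patched[:, unit_to_ablate] = 0.0
    return patched


handle = model[1].register_forward_hook(ablate_unit)  # hook the ReLU layer
ablated = model(x)
handle.remove()

effect = (baseline - ablated).abs().mean().item()
print(f"mean output change from ablating unit {unit_to_ablate}: {effect:.4f}")
```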
Comparative Advantages
| Organization | Primary Focus | Interpretability Approach |
|---|---|---|
| Conjecture | CoEm, ground-up interpretability | Design-time interpretability |
| Anthropic | Frontier models + interpretability | Post-hoc analysis of LLMs |
| ARC (Alignment Research Center) | Theoretical alignment | Evaluation and ELK research |
| Redwood Research | AI control | Interpretability for control |
Strategic Position
Theory of Change
Conjecture's pathway to AI safety impact:
- Develop scalable interpretability techniques for powerful AI systems
- Demonstrate CoEm viability as competitive alternative to black-box scaling
- Influence field direction toward interpretability-first development
- Inform governance with technical feasibility insights
- Build safe systems using CoEm principles if successful
European AI Safety Hub
| Role | Impact | Examples |
|---|---|---|
| Geographic Diversity | High | Alternative to Bay Area concentration |
| Policy Engagement | Growing | UK AISI consultation |
| Talent Development | Moderate | European researcher recruitment |
| Community Building | Early | Workshops and collaborations |
Challenges and Criticisms
Technical Feasibility
| Challenge | Severity | Status |
|---|---|---|
| CoEm Competitiveness | High | Unresolved - early stage |
| Interpretability Scaling | High | Active research question |
| Human Cognition Complexity | Medium | Ongoing investigation |
| Timeline Alignment | High | Critical if AGI timelines are short |
Organizational Tensions
Commercial Pressure vs Safety Mission
- VC funding creates return expectations
- Potential future deployment pressure
- Comparison to Anthropic's commercialization path
Publication Strategy Criticism
- Shift from EleutherAI's radical openness
- Selective research sharing decisions
- Balance between transparency and safety
Current Research Outputs
Published Work
| Type | Focus | Impact |
|---|---|---|
| Technical Papers | Interpretability methods | Research community |
| Blog Posts | CoEm explanations | Public understanding |
| Policy Contributions | Technical feasibility | Governance decisions |
| Open Source Tools | Interpretability software | Research ecosystem |
Research Questions
Key Questions
- Can CoEm produce AI systems competitive with scaled LLMs?
- Is mechanistic interpretability sufficient for AGI safety verification?
- How will commercial pressures affect Conjecture's research direction?
- What role should interpretability play in AI governance frameworks?
- Can cognitive emulation bridge neuroscience and AI safety research?
- How does CoEm relate to other alignment approaches like Constitutional AI?
Timeline and Risk Estimates
Leadership Risk Assessments
Conjecture's leadership has articulated clear views on AI timelines and safety approaches, which fundamentally motivate their Cognitive Emulation research agenda and organizational strategy:
| Expert/Source | Estimate | Reasoning |
|---|---|---|
| Connor Leahy | AGI: 2-10 years | Leahy has consistently expressed short AI timeline views across multiple public statements and podcasts from 2023-2024, suggesting transformative AI systems could emerge within years rather than decades. These short timelines create urgency for developing interpretability-first approaches before AGI arrives. |
| Connor Leahy | P(doom): High without major changes | Leahy has expressed significant concern about the default trajectory of AI development in 2023 statements, arguing that prosaic alignment approaches pursued by frontier labs are insufficient to ensure safety. This pessimism about conventional alignment motivates Conjecture's alternative CoEm approach. |
| Conjecture Research | Prosaic alignment: Insufficient | The organization's core research direction reflects a fundamental assessment that post-hoc alignment of large language models through techniques like RLHF and Constitutional AI cannot provide adequate safety guarantees. This view, maintained since founding, drives their pursuit of interpretability-first system design. |
| Organization | Interpretability: Necessary for safety | Conjecture's founding premise holds that mechanistic interpretability is not merely useful but necessary for AI safety verification. This fundamental research assumption distinguishes them from organizations pursuing behavioral safety approaches and shapes their entire technical agenda. |
Future Scenarios
Research Trajectory Projections
| Timeline | Optimistic | Realistic | Pessimistic |
|---|---|---|---|
| 2-3 years | CoEm demonstrations, policy influence | Continued interpretability advances | Commercial pressure compromises |
| 3-5 years | Competitive interpretable systems | Mixed results, partial success | Research agenda stagnates |
| 5+ years | Field adoption of CoEm principles | Portfolio contribution to safety | Marginalized approach |
Critical Dependencies
| Factor | Importance | Uncertainty |
|---|---|---|
| Technical Feasibility | Critical | High - unproven at scale |
| Funding Continuity | High | Medium - VC expectations |
| AGI Timeline | Critical | High - if very short, insufficient time |
| Field Receptivity | Medium | Medium - depends on results |
Relationships and Collaborations
Within AI Safety Ecosystem
| Organization | Relationship | Collaboration Type |
|---|---|---|
| Anthropic | Friendly competition | Interpretability research sharing |
| ARC | Complementary | Different technical approaches |
| MIRI | Aligned concerns | Shared skepticism of prosaic alignment |
| Academic Labs | Collaborative | Interpretability technique development |
Policy and Governance
UK Engagement
- UK AI Safety Institute consultation
- Technical feasibility assessments
- European AI Act discussions
International Influence
- Growing presence in global AI safety discussions
- Alternative perspective to US-dominated discourse
- Technical grounding for governance approaches
Sources & Resources
Primary Sources
| Type | Source | Description |
|---|---|---|
| Official Website | Conjecture.dev | Research updates, team information |
| Research Papers | Google Scholar | Technical publications |
| Blog Posts | Conjecture Blog | Research explanations, philosophy |
| Interviews | Connor Leahy talks | Leadership perspectives |
Secondary Analysis
| Type | Source | Focus |
|---|---|---|
| AI Safety Analysis | LessWrong posts | Community discussion |
| Technical Reviews | Alignment Forum | Research evaluation |
| Policy Reports | GovAI analysis | Governance implications |
| Funding News | TechCrunch coverage | Business developments |
Related Resources
| Topic | Internal Links | External Resources |
|---|---|---|
| Interpretability | Technical interpretability | Anthropic / Transformer Circuits interpretability work |
| Alignment Approaches | Why Alignment is Hard | AI Alignment Forum |
| European AI Policy | UK AISI | EU AI Office |
| Related Orgs | Safety organizations | AI safety community resources |