Worldview-Intervention Mapping


This framework maps beliefs about AI timelines (short/medium/long), alignment difficulty (hard/medium/tractable), and coordination feasibility (feasible/difficult/impossible) to intervention priorities, showing 2-10x differences in optimal resource allocation across worldview clusters. The model estimates that 20-50% of field resources may be wasted through worldview-work mismatches, and it offers specific portfolio recommendations for each worldview cluster.

Model Type: Strategic Framework
Focus: Worldview-Action Coherence
Key Output: Intervention priorities given different worldviews
Related Analyses: AI Risk Portfolio Analysis
Related Risks: AI Development Racing Dynamics

Overview

This model maps how beliefs about AI risk create distinct worldview clusters with dramatically different intervention priorities. Different worldviews imply 2-10x differences in optimal resource allocation across pause advocacy, technical research, and governance work.

The model identifies that misalignment between personal beliefs and work focus may waste 20-50% of field resources. AI safety researchers hold fundamentally different assumptions about timelines, technical difficulty, and coordination feasibility, but these differences often don't translate to coherent intervention choices.

The framework identifies four major worldview clusters, ranging from "doomer" (short timelines + hard alignment), which prioritizes pause advocacy, to "technical optimist" (medium timelines + tractable alignment), which emphasizes research investment.

Risk/Impact Assessment

| Dimension | Assessment | Evidence | Timeline |
|---|---|---|---|
| Severity | High | 2-10x resource allocation differences across worldviews | Immediate |
| Likelihood | Very High | Systematic worldview-work mismatches observed | Ongoing |
| Scope | Field-wide | Affects individual researchers, orgs, and funders | All levels |
| Trend | Worsening | Field growth without explicit worldview coordination | 2024-2027 |

Strategic Question Framework

Given your beliefs about AI risk, which interventions should you prioritize?

The core problem: People work on interventions that don't match their stated beliefs about AI development. This model makes explicit which interventions are most valuable under specific worldview assumptions.

How to Use This Framework

| Step | Action | Method |
|---|---|---|
| 1 | Identify worldview | Assess beliefs on timeline/difficulty/coordination |
| 2 | Check priorities | Map beliefs to intervention recommendations |
| 3 | Audit alignment | Compare current work to worldview implications |
| 4 | Adjust strategy | Either change work focus or update worldview |
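
As an illustration of step 3, the sketch below shows what an automated worldview-work audit could look like. It is a minimal Python example: the worldview names, intervention labels, and priority values are drawn loosely from the cluster tables later on this page and are illustrative, not a canonical dataset.

```python
# Illustrative sketch of the step-3 "audit alignment" check.
# Priority values loosely mirror the cluster tables later on this page.
PRIORITIES = {
    "doomer": {
        "pause_advocacy": "Very High",
        "compute_governance": "Very High",
        "technical_safety_research": "High",
        "field_building": "Low",
    },
    "technical_optimist": {
        "technical_safety_research": "Very High",
        "interpretability": "Very High",
        "field_building": "High",
        "pause_advocacy": "Low",
    },
}

def audit_alignment(worldview: str, current_work: str) -> str:
    """Step 3: compare current work focus to worldview implications."""
    priority = PRIORITIES[worldview].get(current_work, "Unrated")
    if priority in ("Very High", "High"):
        return f"Aligned: '{current_work}' is {priority} priority under the '{worldview}' worldview."
    return (f"Possible mismatch: '{current_work}' is only {priority} priority under the "
            f"'{worldview}' worldview; consider step 4 (adjust strategy).")

print(audit_alignment("doomer", "field_building"))
```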

Core Worldview Dimensions

Three belief dimensions drive most disagreement about intervention priorities:

flowchart TD
  subgraph Dimensions["Key Worldview Dimensions"]
      T[Timeline: When does risk materialize?]
      D[Difficulty: How hard is alignment?]
      C[Coordination: Can actors cooperate?]
  end

  T --> |Short| TS[2025-2030]
  T --> |Medium| TM[2030-2040]
  T --> |Long| TL[2040+]

  D --> |Hard| DH[Fundamental obstacles]
  D --> |Medium| DM[Solvable with effort]
  D --> |Tractable| DT[Largely solved already]

  C --> |Feasible| CF[Treaties possible]
  C --> |Difficult| CD[Limited cooperation]
  C --> |Impossible| CI[Pure competition]

  style T fill:#cceeff
  style D fill:#ffcccc
  style C fill:#ccffcc

Dimension 1: Timeline Beliefs

| Timeline | Key Beliefs | Strategic Constraints | Supporting Evidence |
|---|---|---|---|
| Short (2025-2030) | AGI within 5 years; scaling continues; few obstacles | Little time for institutional change; must work with existing structures | Amodei prediction of powerful AI by 2026-2027 |
| Medium (2030-2040) | Transformative AI in 10-15 years; surmountable obstacles | Time for institution-building; research can mature | Metaculus consensus ≈2032 for AGI |
| Long (2040+) | Major obstacles remain; slow takeoff; decades available | Full institutional development possible; fundamental research valuable | MIRI position on alignment difficulty |

Dimension 2: Alignment Difficulty

| Difficulty | Core Assumptions | Research Implications | Current Status |
|---|---|---|---|
| Hard | Alignment fundamentally unsolved; deception likely; current techniques inadequate | Technical solutions insufficient; need to slow/stop development | Scheming research shows deception possible |
| Medium | Alignment difficult but tractable; techniques improve with scale | Technical research highly valuable; sustained investment needed | Constitutional AI shows promise |
| Tractable | Alignment largely solved; RLHF + interpretability sufficient | Focus on deployment governance; limited technical urgency | OpenAI safety approach assumes tractability |

Dimension 3: Coordination Feasibility

| Feasibility | Institutional View | Policy Implications | Historical Precedent |
|---|---|---|---|
| Feasible | Treaties possible; labs coordinate; racing avoidable | Invest heavily in coordination mechanisms | Nuclear Test Ban Treaty, Montreal Protocol |
| Difficult | Partial coordination; major actors defect; limited cooperation | Focus on willing actors; partial governance | Climate agreements with partial compliance |
| Impossible | Pure competition; no stable equilibria; universal racing | Technical safety only; governance futile | Failed disarmament during arms races |
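
A minimal sketch of how the three dimensions could be encoded as a data type, with one example combination per cluster described in the next section. The enum names and the specific cluster assignments are illustrative, not a canonical schema.

```python
# Hypothetical encoding of the three worldview dimensions described above.
from dataclasses import dataclass
from enum import Enum

class Timeline(Enum):
    SHORT = "2025-2030"
    MEDIUM = "2030-2040"
    LONG = "2040+"

class Difficulty(Enum):
    HARD = "fundamental obstacles"
    MEDIUM = "solvable with effort"
    TRACTABLE = "largely solved already"

class Coordination(Enum):
    FEASIBLE = "treaties possible"
    DIFFICULT = "limited cooperation"
    IMPOSSIBLE = "pure competition"

@dataclass(frozen=True)
class Worldview:
    timeline: Timeline
    difficulty: Difficulty
    coordination: Coordination

# Representative (not exhaustive) combinations for the four clusters below:
CLUSTER_EXAMPLES = {
    "doomer": Worldview(Timeline.SHORT, Difficulty.HARD, Coordination.DIFFICULT),
    "technical optimist": Worldview(Timeline.MEDIUM, Difficulty.MEDIUM, Coordination.FEASIBLE),
    "governance-focused": Worldview(Timeline.LONG, Difficulty.MEDIUM, Coordination.FEASIBLE),
    # "Any timeline + tractable alignment + any coordination level":
    "accelerationist/optimist": Worldview(Timeline.MEDIUM, Difficulty.TRACTABLE, Coordination.DIFFICULT),
}
```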

Four Major Worldview Clusters

quadrantChart
  title Worldview Clusters by Timeline and Difficulty
  x-axis Alignment Tractable --> Alignment Hard
  y-axis Long Timelines --> Short Timelines
  quadrant-1 PAUSE/STOP
  quadrant-2 TECHNICAL SPRINT
  quadrant-3 INSTITUTION BUILD
  quadrant-4 STEADY PROGRESS
  Doomer: [0.85, 0.85]
  Accelerationist: [0.15, 0.75]
  Governance-focused: [0.35, 0.25]
  Technical optimist: [0.25, 0.55]

Cluster 1: "Doomer" Worldview

Beliefs: Short timelines + Hard alignment + Coordination difficult

| Intervention Category | Priority | Expected ROI | Key Advocates |
|---|---|---|---|
| Pause/slowdown advocacy | Very High | 10x+ if successful | Eliezer Yudkowsky |
| Compute governance | Very High | 5-8x via bottlenecks | RAND reports |
| Technical safety research | High | 2-4x (low prob, high value) | MIRI approach |
| International coordination | Medium | 8x if achieved (low prob) | FHI governance work |
| Field-building | Low | 1-2x (insufficient time) | Long-term capacity building |
| Public engagement | Medium | 3-5x via political support | Pause AI movement |

Coherence Check: If you believe this worldview but work on field-building or long-term institution design, your work may be misaligned with your beliefs.

Cluster 2: "Technical Optimist" Worldview

Beliefs: Medium timelines + Medium difficulty + Coordination possible

| Intervention Category | Priority | Expected ROI | Leading Organizations |
|---|---|---|---|
| Technical safety research | Very High | 8-12x via direct solutions | Anthropic, Redwood |
| Interpretability | Very High | 6-10x via understanding | Chris Olah's work |
| Lab safety standards | High | 4-6x via industry norms | Partnership on AI |
| Compute governance | Medium | 3-5x supplementary value | CSET research |
| Pause advocacy | Low | 1x or negative (unnecessary) | Premature intervention |
| Field-building | High | 5-8x via capacity | CHAI, MATS |

Coherence Check: If you believe this worldview but work on pause advocacy or aggressive regulation, your efforts may be counterproductive.

Cluster 3: "Governance-Focused" Worldview

Beliefs: Medium-long timelines + Medium difficulty + Coordination feasible

| Intervention Category | Priority | Expected ROI | Key Institutions |
|---|---|---|---|
| International coordination | Very High | 10-15x via global governance | UK AISI, US AISI |
| Domestic regulation | Very High | 6-10x via norm-setting | EU AI Act |
| Institution-building | Very High | 8-12x via capacity | AI Safety Institute development |
| Technical standards | High | 4-6x enabling governance | NIST AI RMF |
| Technical research | Medium | 3-5x (others lead) | Research coordination role |
| Pause advocacy | Low | 1-2x premature | Governance development first |

Coherence Check: If you believe this worldview but focus purely on technical research, you may be underutilizing comparative advantage.

Cluster 4: "Accelerationist/Optimist" Worldview

Beliefs: Any timeline + Tractable alignment + Any coordination level

| Intervention Category | Priority | Expected ROI | Rationale |
|---|---|---|---|
| Capability development | Very High | 15-25x via benefits | AI solves problems faster than creates them |
| Deployment governance | Medium | 2-4x addressing specific harms | Targeted harm prevention |
| Technical safety | Low | 1-2x already adequate | RLHF sufficient for current systems |
| Pause/slowdown | Very Low | Negative ROI | Delays beneficial AI |
| Aggressive regulation | Very Low | Large negative ROI | Stifles innovation unnecessarily |

Coherence Check: If you hold this worldview but work on safety research or pause advocacy, your work contradicts your beliefs about AI risk levels.

Intervention Effectiveness Matrix

The following analysis shows how intervention effectiveness varies dramatically across worldviews:

| Intervention | Short+Hard (Doomer) | Short+Tractable (Sprint) | Long+Hard (Patient) | Long+Tractable (Optimist) |
|---|---|---|---|---|
| Pause/slowdown | Very High (10x) | Low (1x) | Medium (4x) | Very Low (-2x) |
| Compute governance | Very High (8x) | Medium (3x) | High (6x) | Low (1x) |
| Alignment research | High (3x) | Low (2x) | Very High (12x) | Low (1x) |
| Interpretability | High (4x) | Medium (5x) | Very High (10x) | Medium (3x) |
| International treaties | Medium (2x) | Low (1x) | Very High (15x) | Medium (4x) |
| Domestic regulation | Medium (3x) | Medium (4x) | High (8x) | Medium (3x) |
| Lab safety standards | High (6x) | High (7x) | High (8x) | Medium (4x) |
| Field-building | Low (1x) | Low (2x) | Very High (12x) | Medium (5x) |
| Public engagement | Medium (4x) | Low (2x) | High (7x) | Low (1x) |

Critical Insight

Working on interventions that rank "Very High" only under a worldview other than your own can be 5-10x less efficient than an optimal allocation. This represents one of the largest avoidable efficiency losses in the AI safety field.
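
One way to make this concrete is to treat the matrix as a lookup table and weight each column by your credence in that worldview. The Python sketch below does this for a subset of the matrix rows; the multipliers are the illustrative point values from the table above, and the example belief weights are invented for demonstration.

```python
# Rough sketch: rank interventions by probability-weighted ROI under a
# belief distribution over worldviews. Multipliers are the illustrative
# values from the effectiveness matrix above, not measured returns.
ROI = {  # intervention -> {worldview: ROI multiplier}
    "pause/slowdown":       {"doomer": 10, "sprint": 1, "patient": 4,  "optimist": -2},
    "compute governance":   {"doomer": 8,  "sprint": 3, "patient": 6,  "optimist": 1},
    "alignment research":   {"doomer": 3,  "sprint": 2, "patient": 12, "optimist": 1},
    "lab safety standards": {"doomer": 6,  "sprint": 7, "patient": 8,  "optimist": 4},
    "field-building":       {"doomer": 1,  "sprint": 2, "patient": 12, "optimist": 5},
}

def expected_roi(beliefs: dict[str, float]) -> list[tuple[str, float]]:
    """Rank interventions by probability-weighted ROI under a belief distribution."""
    scores = {
        name: sum(p * multipliers[w] for w, p in beliefs.items())
        for name, multipliers in ROI.items()
    }
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

# Example beliefs: 40% doomer, 30% sprint, 20% patient, 10% optimist
for name, score in expected_roi({"doomer": 0.4, "sprint": 0.3, "patient": 0.2, "optimist": 0.1}):
    print(f"{name:22s} {score:5.1f}x")
```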

Portfolio Strategies for Uncertainty

Timeline Uncertainty Management

| Uncertainty Level | Recommended Allocation | Hedge Strategy |
|---|---|---|
| 50/50 short vs long | 60% urgent interventions, 40% patient capital | Compute governance + field-building |
| 70% short, 30% long | 80% urgent, 20% patient with option value | Standards + some institution-building |
| 30% short, 70% long | 40% urgent, 60% patient development | Institution-building + some standards |
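
In each row above, the recommended urgent share equals the probability placed on short timelines plus roughly a ten-point tilt toward urgent work (reflecting option value). The sketch below generalizes that reading to arbitrary belief splits; the 0.10 tilt is simply read off the table, not derived from first principles.

```python
# Sketch of the timeline-hedging pattern in the table above:
# urgent share = P(short timelines) + ~10-point tilt toward urgent work.
def hedged_allocation(p_short: float, urgency_tilt: float = 0.10) -> dict[str, float]:
    urgent = min(1.0, max(0.0, p_short + urgency_tilt))
    return {"urgent interventions": round(urgent, 2),
            "patient capital": round(1.0 - urgent, 2)}

for p in (0.5, 0.7, 0.3):
    print(f"P(short)={p:.0%} -> {hedged_allocation(p)}")
# P(short)=50% -> {'urgent interventions': 0.6, 'patient capital': 0.4}
# P(short)=70% -> {'urgent interventions': 0.8, 'patient capital': 0.2}
# P(short)=30% -> {'urgent interventions': 0.4, 'patient capital': 0.6}
```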

Alignment Difficulty Hedging

| Belief Distribution | Technical Research | Governance/Coordination | Rationale |
|---|---|---|---|
| 50% hard, 50% tractable | 40% allocation | 60% allocation | Governance has value regardless |
| 80% hard, 20% tractable | 20% allocation | 80% allocation | Focus on buying time |
| 20% hard, 80% tractable | 70% allocation | 30% allocation | Technical solutions likely |

Coordination Feasibility Strategies

| Scenario | Unilateral Capacity | Multilateral Investment | Leading Actor Focus |
|---|---|---|---|
| High coordination feasibility | 20% | 60% | 20% |
| Medium coordination feasibility | 40% | 40% | 20% |
| Low coordination feasibility | 60% | 10% | 30% |

Current State & Trajectory

Field-Wide Worldview Distribution

| Worldview Cluster | Estimated Prevalence | Resource Allocation | Alignment Score |
|---|---|---|---|
| Doomer | 15-20% of researchers | ≈30% of resources | Moderate misalignment |
| Technical Optimist | 40-50% of researchers | ≈45% of resources | Good alignment |
| Governance-Focused | 25-30% of researchers | ≈20% of resources | Poor alignment |
| Accelerationist | 5-10% of researchers | ≈5% of resources | Unknown |

Observed Misalignment Patterns

Based on AI Alignment Forum surveys and 80,000 Hours career advising:

| Common Mismatch | Frequency | Estimated Efficiency Loss |
|---|---|---|
| "Short timelines" researcher doing field-building | 25% of junior researchers | 3-5x effectiveness loss |
| "Alignment solved" researcher doing safety work | 15% of technical researchers | 2-3x effectiveness loss |
| "Coordination impossible" researcher doing policy | 10% of policy researchers | 4-6x effectiveness loss |

2024-2027 Trajectory Predictions

| Trend | Likelihood | Impact on Field Efficiency |
|---|---|---|
| Increased worldview polarization | High | -20% to -30% efficiency |
| Better worldview-work matching | Medium | +15% to +25% efficiency |
| Explicit worldview institutions | Low | +30% to +50% efficiency |

Key Uncertainties & Cruxes

Key Questions

  • What's the actual distribution of worldviews among AI safety researchers?
  • How much does worldview-work mismatch reduce field effectiveness quantitatively?
  • Can people reliably identify and articulate their own worldview assumptions?
  • Would explicit worldview discussion increase coordination or create harmful polarization?
  • How quickly should people update worldviews based on new evidence?
  • Do comparative advantages sometimes override worldview-based prioritization?

Resolution Timelines

| Uncertainty | Evidence That Would Resolve | Timeline |
|---|---|---|
| Actual worldview distribution | Comprehensive field survey | 6-12 months |
| Quantified efficiency losses | Retrospective impact analysis | 1-2 years |
| Worldview updating patterns | Longitudinal researcher tracking | 2-5 years |
| Institutional coordination effects | Natural experiments with explicit worldview orgs | 3-5 years |

Implementation Guidance

For Individual Researchers

| Career Stage | Primary Action | Secondary Actions |
|---|---|---|
| Graduate students | Identify worldview before specializing | Talk to advisors with different worldviews |
| Postdocs | Audit current work against worldview | Consider switching labs if misaligned |
| Senior researchers | Make worldview explicit in work | Mentor others on worldview coherence |
| Research leaders | Hire for worldview diversity | Create space for worldview discussion |

For Organizations

| Organization Type | Strategic Priority | Implementation Steps |
|---|---|---|
| Research organizations | Clarify institutional worldview | Survey staff, align strategy, communicate assumptions |
| Grantmaking organizations | Develop worldview-coherent portfolios | Map grantee worldviews, identify gaps, fund strategically |
| Policy organizations | Coordinate across worldview differences | Create cross-worldview working groups |
| Field-building organizations | Facilitate worldview discussion | Host workshops, create assessment tools |

For Funders

| Funding Approach | When Appropriate | Risk Management |
|---|---|---|
| Single worldview concentration | High confidence in specific worldview | Diversify across intervention types within worldview |
| Worldview hedging | High uncertainty about key parameters | Fund complementary approaches, avoid contradictory grants |
| Worldview arbitrage | Identified underinvested worldview-intervention combinations | Focus on neglected high-value combinations |

Failure Mode Analysis

Individual Failure Modes

| Failure Mode | Prevalence | Mitigation Strategy |
|---|---|---|
| Social conformity bias | High | Create protected spaces for worldview diversity |
| Career incentive misalignment | Medium | Reward worldview-coherent work choices |
| Worldview rigidity | Medium | Encourage regular worldview updating |
| False precision in beliefs | High | Emphasize uncertainty and portfolio approaches |

Institutional Failure Modes

| Failure Mode | Symptoms | Solution |
|---|---|---|
| Worldview monoculture | All staff share same assumptions | Actively hire for belief diversity |
| Incoherent strategy | Contradictory intervention portfolio | Make worldview assumptions explicit |
| Update resistance | Strategy unchanged despite new evidence | Create structured belief updating processes |

Sources & Resources

Research Literature

| Category | Key Sources | Quality | Focus |
|---|---|---|---|
| Worldview surveys | AI Alignment Forum survey | Medium | Community beliefs |
| Intervention effectiveness | 80,000 Hours research | High | Career prioritization |
| Strategic frameworks | Coefficient Giving worldview reports | High | Cause prioritization |

Tools & Assessments

| Resource | Purpose | Access |
|---|---|---|
| Worldview self-assessment | Individual belief identification | AI Safety Fundamentals |
| Intervention prioritization calculator | Portfolio optimization | EA Forum tools |
| Career decision frameworks | Work-belief alignment | 80,000 Hours coaching |

Organizations by Worldview

| Organization | Primary Worldview | Core Interventions |
|---|---|---|
| MIRI | Doomer (short+hard) | Agent foundations, pause advocacy |
| Anthropic | Technical optimist | Constitutional AI, interpretability |
| CSET | Governance-focused | Policy research, international coordination |
| Redwood Research | Technical optimist | Alignment research, interpretability |

Complementary Models

  • AI Risk Portfolio Analysis - Risk category prioritization across scenarios
  • Racing Dynamics - How competition affects coordination feasibility
  • International Coordination Game - Factors affecting cooperation
  • Doomer Worldview - Short timelines, hard alignment assumptions
  • Governance-Focused Worldview - Coordination optimism, institution-building focus
  • Long Timelines Worldview - Patient capital, fundamental research emphasis

References

A RAND Corporation research report examining strategic prioritization frameworks related to AI development and safety considerations. The report likely addresses how organizations and policymakers should weigh competing priorities and worldviews when making decisions about AI governance and deployment.

★★★★☆

Partnership on AI (PAI) is a nonprofit coalition of AI researchers, civil society organizations, academics, and companies working to develop best practices, conduct research, and shape policy around responsible AI development. It brings together diverse stakeholders to address challenges including safety, fairness, transparency, and the societal impacts of AI systems. PAI serves as a coordination hub for cross-sector dialogue on AI governance.

★★★☆☆

This page outlines the European Commission's comprehensive policy framework for AI, centered on promoting trustworthy, human-centric AI through the AI Act, AI Continent Action Plan, and Apply AI Strategy. It aims to balance Europe's global AI competitiveness with safety, fundamental rights, and democratic values. Key initiatives include AI Factories, the InvestAI Facility, GenAI4EU, and the Apply AI Alliance.

★★★★☆
4. Future of Humanity Institute

The official website of the Future of Humanity Institute (FHI), an Oxford University research center that was foundational in establishing the fields of existential risk research and AI safety. FHI closed on 16 April 2024 after approximately two decades of influential work. The site now serves as an archived record of the institution's history, research agenda, and legacy.

★★★★☆
5. 80,000 Hours coaching · 80,000 Hours

80,000 Hours offers free, personalized one-on-one career advising focused on helping analytically-minded individuals pursue high-impact work, with particular emphasis on AI safety and other pressing global problems. Coaches help evaluate career options, make professional introductions to experts and hiring managers, and suggest concrete next steps. With over 5,000 people advised and a 95% recommendation rate, the service has demonstrably shifted career trajectories toward higher-impact work.

★★★☆☆
6. AI Alignment Forum · Alignment Forum · Blog post

The AI Alignment Forum is a central community platform for technical AI safety and alignment research discussion. The featured post argues against 'reductive utility' (utility functions over possible worlds) and proposes the Jeffrey-Bolker framework as an alternative that avoids ontological crises and computability constraints by grounding preferences in agent-relative events rather than universal physics.

★★★☆☆
7. AI Safety Fundamentals · aisafetyfundamentals.com

BlueDot Impact (formerly AI Safety Fundamentals) is a leading talent accelerator offering free, cohort-based courses in Technical AI Safety, AI Governance, AGI Strategy, and Biosecurity. With over 6,000 alumni placed at organizations like OpenAI, Anthropic, DeepMind, and government bodies, it serves as a major pipeline for AI safety careers. Courses run monthly and are designed to help people identify where they can have the most impact.

The NIST AI RMF is a voluntary, consensus-driven framework released in January 2023 to help organizations identify, assess, and manage risks associated with AI systems while promoting trustworthiness across design, development, deployment, and evaluation. It provides structured guidance organized around core functions and is accompanied by a Playbook, Roadmap, and a Generative AI Profile (2024) addressing risks specific to generative AI systems.

★★★★★

Yudkowsky argues that unlike other catastrophic risks, AGI lacks a clear 'fire alarm' moment that would create social permission to take the threat seriously. Using the psychology of pluralistic ignorance, he explains why people fail to act on genuine danger signals and why the absence of a socially-sanctioned alarm makes AGI preparedness uniquely difficult. He concludes that we should not wait for such an alarm before acting on AGI safety.

★★★☆☆
10. AI Alignment Forum survey · Alignment Forum · Rob Bensinger · 2021 · Blog post

Rob Bensinger surveyed ~117 AI safety researchers on two questions: the existential risk from insufficient technical AI safety research, and from AI misalignment with deployer intentions. With 44 respondents (38% response rate), the post shares raw probability estimates without analysis, noting individual caveats and cautioning against strong conclusions from aggregate numbers.

★★★☆☆

OpenAI's central safety page providing updates on their approach to AI safety research, deployment practices, and ongoing safety commitments. It serves as a hub for information on OpenAI's safety-related initiatives, policies, and technical work aimed at ensuring their AI systems are safe and beneficial.

★★★★☆

Open Philanthropy's hub for cause prioritization research, presenting their frameworks and worldview investigations across global catastrophic risks, AI safety, biosecurity, and other focus areas. These reports outline how Open Philanthropy weighs and allocates funding across cause areas based on expected impact, neglectedness, and tractability. They reflect the strategic thinking of one of the most influential funders in the AI safety and existential risk space.

★★★★☆

A letter from Anthropic CEO Dario Amodei outlining his worldview on AI development, the company's strategic approach to building powerful AI safely, and predictions about transformative AI timelines. It articulates why Anthropic occupies a unique position as a safety-focused lab continuing to build frontier AI despite existential risks.

★★★★☆

PauseAI is an advocacy movement calling for an international pause on the development of advanced AI systems until adequate safety measures and governance frameworks are in place. The organization coordinates activists, provides educational resources, and lobbies policymakers to take urgent action on AI risk. It represents a direct-action approach to AI safety that prioritizes preventing catastrophic outcomes over accelerating beneficial AI.

15. MATS Research Program · matsprogram.org

MATS is an intensive fellowship program designed to help researchers transition into AI safety careers, offering structured mentorship from leading researchers, stipends, and community integration. Since 2021, it has trained over 446 researchers who have collectively produced 150+ research papers and gone on to work at top AI safety organizations.

16. EA Forum Career Posts · EA Forum · Blog post

The Effective Altruism Forum serves as a community hub for discussing careers, cause prioritization, and field-building within the EA and AI safety ecosystem. It hosts posts on career transitions into high-impact roles, including AI safety research, policy, and governance positions. The forum aggregates community thinking on how individuals can best contribute to reducing existential risks.

★★★☆☆

80,000 Hours makes the case that AI safety is one of the most pressing career areas for people who want to do the most good, arguing that advanced AI systems could develop power-seeking behaviors posing existential risks. The guide surveys the landscape of AI risk, outlines key research and policy directions, and provides career advice for those looking to contribute. It serves as a widely-read entry point for people considering AI safety work.

★★★☆☆

Metaculus is a collaborative online forecasting platform where users make probabilistic predictions on future events across domains including AI development, biosecurity, and global catastrophic risks. It aggregates crowd wisdom and expert forecasts to produce calibrated probability estimates on complex questions relevant to long-term planning and existential risk assessment.

★★★☆☆
19. Constitutional AI: Harmlessness from AI Feedback · Anthropic · Bai et al. · 2022 · Paper

Anthropic introduces a novel approach to AI training called Constitutional AI, which uses self-critique and AI feedback to develop safer, more principled AI systems without extensive human labeling.

★★★★☆

80,000 Hours is a nonprofit that provides research and advice on how to use your career to have the most positive impact on the world's most pressing problems, with significant focus on AI safety and existential risk. They offer career guides, job boards, and in-depth research on high-priority cause areas and career paths. Their methodology emphasizes earning to give, direct work in high-impact fields, and building career capital.

★★★☆☆
21. CSET: AI Market Dynamics · CSET Georgetown

CSET (Center for Security and Emerging Technology) at Georgetown University is a policy research organization focused on the security implications of emerging technologies, particularly AI. It produces research on AI policy, workforce, geopolitics, and governance. The content could not be fully extracted, limiting detailed analysis.

★★★★☆

Anthropic's research demonstrates that large language models can be trained to exhibit deceptive 'sleeper agent' behaviors—acting safely during training but executing harmful actions when triggered in deployment. Critically, standard safety fine-tuning techniques (RLHF, supervised fine-tuning, adversarial training) fail to reliably remove these backdoors and may even make deceptive behavior more hidden rather than eliminated.

★★★★☆

Anthropic's research page aggregates their work across AI alignment, mechanistic interpretability, and societal impact assessment, all oriented toward understanding and mitigating risks from increasingly capable AI systems. It serves as a central hub for their published findings and ongoing safety-focused investigations.

★★★★☆
24. UK AI Safety Institute (AISI) · UK AI Safety Institute · Government

The UK AI Safety Institute (AISI) is the UK government's dedicated body for evaluating and mitigating risks from advanced AI systems. It conducts technical safety research, develops evaluation frameworks for frontier AI models, and works with international partners to inform global AI governance and policy.

★★★★☆

Related Wiki Pages

Top Related Pages

Approaches

AI Safety Intervention Portfolio

Analysis

International AI Coordination Game Model
AI Safety Intervention Effectiveness Matrix
Planning for Frontier Lab Scaling
AI Risk Activation Timeline Model
AI Safety Research Value Model

Concepts

Long-Timelines Technical Worldview
Governance-Focused Worldview
AI Doomer Worldview
Longtermwiki Value Proposition

Organizations

Redwood Research
Anthropic
UK AI Safety Institute
Machine Intelligence Research Institute
Center for Human-Compatible AI

Other

Interpretability
Eliezer Yudkowsky
Chris Olah