
AI Acceleration Tradeoff Model


Quantitative framework for evaluating how changes to AI development speed affect existential risk and long-term value. Models the marginal impact of acceleration/deceleration on P(existential catastrophe), safety readiness, governance preparedness, and conditional future value. Finds that 1 year of additional preparation time reduces x-risk by 1-4 percentage points depending on current readiness, but also delays economic and scientific benefits worth 0.1-0.5% of future value annually.

Model Type: Cost-Benefit Analysis
Scope: AI Timeline Acceleration/Deceleration
Key Insight: The value of acceleration vs. deceleration depends critically on current safety readiness level

Related Models: Safety-Capability Tradeoff Model, Capability-Alignment Race Model, Intervention Timing Windows
Related Risks: AI Development Racing Dynamics

Overview

Many AI safety interventions, governance proposals, and capabilities advances can be analyzed through a common lens: how much do they speed up or slow down the arrival of transformative AI (TAI), and what are the consequences of that time shift? This model provides a quantitative framework for making those comparisons.

The core claim is that the value of any action affecting AI timelines can be decomposed into its effects on three quantities:

  1. P(existential catastrophe) — How does the time shift change the probability of permanent human disempowerment or extinction?
  2. Conditional future value — How does the time shift change the expected value of the future, assuming we survive the transition?
  3. Transition costs — What are the direct costs of the time shift itself (delayed benefits, economic disruption, or racing incentives)?

This decomposition makes it possible to compare seemingly incommensurable actions — a safety research breakthrough, a capabilities speedup, or a regulatory slowdown — on the same scale.

Central framing: At current readiness levels, pulling TAI forward by 1 year is estimated to increase the probability of existential catastrophe by 1-4 percentage points, while producing ambiguous effects on the conditional value of the long-term future (earlier benefits vs. less mature governance). The net effect depends critically on how prepared we are when TAI arrives.

Connection to the AI Transition Model

This model operationalizes the AI Transition Model's causal framework by collapsing its many root factors into a single dimension: time available for preparation. The AI Transition Model identifies root factors like safety-capability gap, racing intensity, and civilizational competence. Each of these factors evolves over time. Accelerating TAI arrival means less time for safety research, governance development, and institutional adaptation — widening the gap between capability and readiness.

The simplification is deliberate: while the full causal model has dozens of parameters, many real-world decisions reduce to "does this make TAI come sooner or later, and by how much?"

The Core Model

Definitions

| Variable | Symbol | Description |
|---|---|---|
| TAI arrival time | T | Year when transformative AI is deployed |
| Baseline x-risk | R₀ | P(existential catastrophe) at current trajectory |
| Time shift | ΔT | Change in TAI arrival time (positive = delay, negative = acceleration) |
| Safety readiness | S(t) | Safety research maturity as a function of time (0 to 1) |
| Governance readiness | G(t) | Governance and institutional preparedness (0 to 1) |
| Conditional value | V(T) | Expected value of the future given survival |
| Transition cost | C(ΔT) | Direct costs of the time shift (delayed benefits, economic disruption, racing incentives) |

Readiness scale anchors: S and G are normalized to [0, 1], where the endpoints represent:

| Score | Safety readiness S | Governance readiness G |
|---|---|---|
| 0.0 | No alignment research exists | No AI-specific governance institutions |
| 0.25 | Basic techniques exist (RLHF, evals) but not validated for TAI-level systems | Preliminary regulations exist (EU AI Act) but not designed for TAI risks |
| 0.50 | Scalable oversight and interpretability work for current frontier models; untested at TAI level | Major jurisdictions have enforceable TAI-specific frameworks; international coordination is functional |
| 0.75 | Alignment techniques empirically validated on near-TAI systems; known failure modes are covered | Global governance regime with monitoring, enforcement, and incident response capacity |
| 1.0 | High confidence that alignment generalizes to TAI; formal or empirical guarantees | Mature international regime comparable to nuclear governance; tested through incidents |

These anchors are inherently subjective. The key insight is not the precise numbers but the shape of the relationship: marginal returns to preparation are highest when readiness is low.

The Value Equation

The expected value of a time shift ΔT can be expressed as:

$$EV(\Delta T) = \Delta V_{risk} + \Delta V_{conditional} - C(\Delta T)$$

Where:

  • ΔV_risk = value gained from reduced existential risk (or lost from increased risk)
  • ΔV_conditional = change in the value of the future conditional on survival
  • C(ΔT) = direct costs of the time shift
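A minimal numerical sketch of this decomposition, assuming illustrative point values for the marginal risk rate, the conditional-value rate, and the transition cost (none of these numbers come from the model itself):

```python
# Sketch of EV(dT) = dV_risk + dV_conditional - C(dT), in units of "whole futures".
# All parameter defaults are illustrative assumptions, not model outputs.

def expected_value_of_shift(
    delta_t_years: float,                       # positive = delay, negative = acceleration
    risk_reduction_per_year: float = 0.03,      # assumed ~3 pp of x-risk removed per year of delay
    conditional_value_per_year: float = -0.002, # assumed net conditional-value change per year of delay
    transition_cost_per_year: float = 0.001,    # assumed direct cost per year of shift
) -> float:
    dv_risk = risk_reduction_per_year * delta_t_years
    dv_conditional = conditional_value_per_year * delta_t_years
    cost = transition_cost_per_year * abs(delta_t_years)
    return dv_risk + dv_conditional - cost

print(expected_value_of_shift(+1.0))  # 1-year delay: positive, risk reduction dominates
print(expected_value_of_shift(-1.0))  # 1-year acceleration: negative, added risk dominates
```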

Risk as a Function of Preparation Time

The probability of existential catastrophe depends on the gap between capability and readiness at the moment of TAI deployment:

$$R(T) = R_{base} \cdot f\left(\frac{Capability(T)}{Safety(T) \cdot Governance(T)}\right)$$

A simplifying assumption: capability at the TAI threshold is roughly fixed regardless of arrival date. In practice, earlier TAI may differ qualitatively — different architectures, capability profiles, or failure modes — which this model does not capture (see Limitations). Under this simplification, the key determinant of risk is how much time safety and governance have had to prepare. More time generally means lower risk, but with diminishing returns as readiness approaches sufficiency.

Marginal Risk per Unit of Acceleration

The key quantity for decision-making is the marginal change in x-risk per unit of acceleration:

$$\frac{\partial R}{\partial T} \approx -\left(\frac{\partial S}{\partial t} \cdot w_S + \frac{\partial G}{\partial t} \cdot w_G\right) \cdot \text{risk sensitivity}$$

where w_S and w_G weight the relative contributions of safety and governance readiness to overall risk reduction.

This derivative is negative (more time reduces risk) but its magnitude varies enormously depending on where we are in the preparation curve.
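A sketch of this calculation, using the midpoints of the improvement-rate ranges from the parameter table in the next section; the weights w_S, w_G and the risk sensitivity are illustrative assumptions, not model estimates:

```python
# Marginal x-risk per year: dR/dT ≈ -(dS/dt · w_S + dG/dt · w_G) · risk_sensitivity
# Improvement rates use midpoints of the ranges below; weights and sensitivity are assumed.

dS_dt = 0.055            # safety readiness improvement, ~5.5 pp/year (midpoint of 3-8)
dG_dt = 0.035            # governance readiness improvement, ~3.5 pp/year (midpoint of 2-5)
w_S, w_G = 0.6, 0.4      # assumed relative weights of safety vs. governance readiness
risk_sensitivity = 0.5   # assumed pp of x-risk removed per pp of weighted readiness gained

dR_dT = -(dS_dt * w_S + dG_dt * w_G) * risk_sensitivity
print(f"Risk change per year of delay: {dR_dT * 100:.1f} pp")  # about -2.4 pp/year
```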

Parameter Estimates

Current State Assessment

| Parameter | Current Estimate | 90% CI | Source |
|---|---|---|---|
| Baseline P(x-catastrophe from AI) | 10-25% | 3-50% | Metaculus, expert surveys |
| Safety readiness S | 0.15-0.30 | 0.05-0.50 | Based on interpretability coverage, alignment techniques maturity |
| Governance readiness G | 0.10-0.25 | 0.05-0.40 | Based on regulatory frameworks, international coordination |
| Safety improvement rate ∂S/∂t | 3-8 pp/year | 1-15 pp/year | Historical progress in interpretability, RLHF |
| Governance improvement rate ∂G/∂t | 2-5 pp/year | 1-10 pp/year | EU AI Act pace, international treaty formation rate |
| TAI arrival (median estimate) | 2030-2040 | 2027-2060 | Metaculus aggregated forecasts |

Marginal Risk Estimates by Readiness Level

The value of additional preparation time depends heavily on current readiness. When readiness is very low, each additional year of preparation is extremely valuable. When readiness is near-sufficient, additional time provides diminishing returns.

| Readiness Level | Safety S | Governance G | Risk Reduction per Year of Delay | Confidence |
|---|---|---|---|---|
| Very low (current) | 0.15 | 0.15 | 2-4 pp | Medium |
| Low | 0.30 | 0.25 | 1.5-3 pp | Medium |
| Moderate | 0.50 | 0.40 | 0.5-2 pp | Low |
| High | 0.70 | 0.60 | 0.1-0.5 pp | Low |
| Near-sufficient | 0.85 | 0.80 | <0.1 pp | Very Low |

Conditional Future Value Effects

Acceleration does not only affect risk — it also affects the value of the future conditional on surviving the transition. These effects are more ambiguous:

| Effect of Earlier TAI | Direction | Magnitude | Confidence |
|---|---|---|---|
| Earlier access to scientific breakthroughs | Positive | +0.1-0.3% of future value per year | Low |
| Earlier solutions to ongoing catastrophes (climate, disease) | Positive | +0.05-0.2% per year | Low |
| Less time for pre-TAI coordination and norm-setting | Negative | -0.1 to -0.5% per year | Medium |
| Higher lock-in risk from less mature governance | Negative | -0.2 to -1.0% per year | Low |
| Faster compounding of TAI-enabled economic growth | Positive | +0.1-0.5% per year | Low |
| Lost option value from less time to learn about alignment difficulty | Negative | -0.1 to -0.5% per year | Medium |
| Earlier AI-assisted safety research (TAI helps solve alignment) | Positive | +0.1-1.0% per year | Very Low |

The net conditional value effect of acceleration is ambiguous and depends heavily on assumptions about how much governance maturity matters for post-TAI outcomes.

Two effects deserve special attention:

  • Option value of delay: additional time before TAI lets us learn whether alignment is fundamentally hard or tractable, whether specific governance approaches work, and what failure modes actually manifest in increasingly capable systems. This learning has asymmetric value — discovering that alignment is harder than expected is much more useful before TAI arrives than after.
  • AI-assisted safety research: sufficiently capable AI systems might dramatically accelerate alignment research itself, creating a dynamic where some capability acceleration is positive for safety. This is highly uncertain — it depends on whether near-TAI systems are capable enough to help with alignment but not yet dangerous enough to pose catastrophic risks, a potentially narrow window.

Comparative Analysis of Actions

Acceleration/Deceleration Estimates by Action Type

The following table estimates the timeline impact of various actions, along with their effects on risk and conditional value:

| Action | Timeline Effect | Risk Effect | Conditional Value Effect | Net Assessment |
|---|---|---|---|---|
| Major capabilities breakthrough | -1 to -3 years | +2-8 pp x-risk | +0.1-0.5% conditional value | Likely net negative unless safety is already sufficient |
| Major alignment breakthrough | 0 to -0.5 years (may speed capabilities) | -3 to -10 pp x-risk | +0.5-2% conditional value | Strongly net positive |
| Comprehensive AI regulation | +0.5 to +2 years | -1 to -4 pp x-risk | -0.1 to +0.3% conditional value | Usually net positive, depends on racing dynamics |
| International compute governance | +0.5 to +1 year | -1 to -3 pp x-risk | +0.1-0.3% conditional value | Net positive if enforceable |
| Voluntary safety commitments (RSPs) | +0.1 to +0.5 years | -0.5 to -2 pp x-risk | +0.1-0.2% conditional value | Modestly positive, fragile |
| Open-sourcing frontier models | -0.5 to -1 year (via ecosystem acceleration) | +1-3 pp x-risk | +0.1-0.5% conditional value (democratization) | Ambiguous, depends on model dangerousness |
| Interpretability research | Roughly neutral | -0.5 to -3 pp x-risk | +0.2-1% conditional value | Net positive |
| Hardware export controls | +0.5 to +2 years (for affected actors) | -0.5 to -2 pp x-risk | -0.1 to -0.3% conditional value (global inequality) | Complex, depends on target |
| Massive compute investment | -0.5 to -2 years | +1-5 pp x-risk | +0.1-0.3% conditional value | Usually net negative |
| AI moratorium (1 year) | +1 year | -1 to -4 pp x-risk | -0.1 to -0.3% conditional value | Net positive at current readiness levels |

Worked Example: Evaluating a Capabilities Advance

Suppose a new architecture reduces compute requirements by 10x, effectively pulling TAI forward by approximately 2 years. At current readiness levels (safety ~0.20, governance ~0.15):

| Component | Calculation | Estimate |
|---|---|---|
| Risk increase | 2 years × 2-4 pp/year | +4-8 pp additional x-risk |
| Conditional value gain | 2 years × 0.1-0.3%/year | +0.2-0.6% conditional value |
| Net assessment | Risk increase dominates | Net negative at current readiness |

If safety readiness were instead at 0.70 (after major alignment breakthroughs):

| Component | Calculation | Estimate |
|---|---|---|
| Risk increase | 2 years × 0.1-0.5 pp/year | +0.2-1.0 pp additional x-risk |
| Conditional value gain | 2 years × 0.1-0.3%/year | +0.2-0.6% conditional value |
| Net assessment | Effects roughly balanced | Ambiguous — depends on value of the future |

This illustrates the core insight: the same acceleration can be net positive or net negative depending on the current state of safety readiness.
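A sketch reproducing both scenarios; the per-year rates are the ones used in the tables above, and the readiness levels only select which rate applies:

```python
# Worked example: a capabilities advance pulls TAI forward by ~2 years.
# Rates are taken from the worked-example tables; nothing here is newly estimated.

def assess_acceleration(years_earlier, risk_pp_per_year, value_pct_per_year):
    risk = tuple(r * years_earlier for r in risk_pp_per_year)
    value = tuple(v * years_earlier for v in value_pct_per_year)
    return risk, value

# Current readiness (S ~ 0.20, G ~ 0.15): 2-4 pp of x-risk per year of acceleration.
risk, value = assess_acceleration(2, (2, 4), (0.1, 0.3))
print(f"Low readiness:  +{risk[0]}-{risk[1]} pp x-risk vs. +{value[0]:.1f}-{value[1]:.1f}% conditional value")

# High safety readiness (S ~ 0.70): only 0.1-0.5 pp of x-risk per year.
risk, value = assess_acceleration(2, (0.1, 0.5), (0.1, 0.3))
print(f"High readiness: +{risk[0]:.1f}-{risk[1]:.1f} pp x-risk vs. +{value[0]:.1f}-{value[1]:.1f}% conditional value")
```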

Worked Example: Evaluating Regulation That Slows AI by 1 Year

| Component | Calculation | Estimate |
|---|---|---|
| Risk reduction | 1 year × 2-4 pp/year | -2 to -4 pp x-risk |
| Conditional value loss | 1 year × 0.1-0.3%/year | -0.1 to -0.3% conditional value |
| Delayed benefits | 1 year of forgone TAI applications | Significant but finite |
| Racing risk | Unilateral slowdown may shift development to less safety-conscious actors | +0.5-2 pp x-risk (partially offsetting) |
| Net assessment | Risk reduction dominates unless racing fully offsets | Usually net positive at current readiness |

Key Dynamics and Nonlinearities

The Readiness Curve

The relationship between preparation time and risk reduction is not linear. It follows a concave curve with diminishing marginal returns: the first few years of additional preparation yield the largest risk reductions (deploying known safety techniques, establishing basic governance), while later years yield progressively less (fine-tuning already-adequate measures, hardening against increasingly unlikely failure modes). This is why the current moment — when readiness is low and marginal returns are high — is so critical.
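One way to make the concavity concrete is an exponential decay toward a residual risk floor; the functional form and constants below are illustrative assumptions chosen to roughly match the marginal-risk table above, not part of the model:

```python
# Risk decays toward a floor as preparation time accumulates (diminishing returns).
R_BASE = 0.20     # assumed baseline P(x-catastrophe) with no additional preparation
R_FLOOR = 0.02    # assumed residual risk even with ample preparation
HALF_LIFE = 5.0   # assumed years of preparation that halve the reducible risk

def risk_after_preparation(years: float) -> float:
    return R_FLOOR + (R_BASE - R_FLOOR) * 0.5 ** (years / HALF_LIFE)

for t in (0, 5, 10, 15, 20):
    gain = (risk_after_preparation(t) - risk_after_preparation(t + 1)) * 100
    print(f"{t:2d} years prepared: risk {risk_after_preparation(t):.3f}, next year buys {gain:.2f} pp")
```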


Racing Dynamics and Unilateral Action

A critical complication: actions that slow one actor may not slow the global frontier if other actors continue at full speed. This introduces a racing multiplier that reduces the effective deceleration:

| Scenario | Effective Deceleration | Racing Multiplier | Notes |
|---|---|---|---|
| Global regulation (enforced) | 80-100% of nominal | 0.8-1.0 | Best case, hard to achieve |
| Major power agreement | 50-80% of nominal | 0.5-0.8 | US-China-EU coordination |
| Unilateral national regulation | 20-50% of nominal | 0.2-0.5 | Development shifts elsewhere |
| Single lab voluntary slowdown | 5-15% of nominal | 0.05-0.15 | Competitors fill gap quickly |

This means that for deceleration to be effective at reducing x-risk, it must either be globally coordinated or work through mechanisms that do not merely shift development to other actors (such as compute governance that affects all actors simultaneously).
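A sketch of the adjustment, using the midpoint of each multiplier range from the table above and an assumed current-readiness rate of roughly 3 pp of x-risk reduction per effective year:

```python
# Effective deceleration = nominal slowdown x racing multiplier.
# Only the effective portion translates into x-risk reduction.

RACING_MULTIPLIER = {                      # midpoints of the ranges in the table above
    "Global regulation (enforced)": 0.90,
    "Major power agreement": 0.65,
    "Unilateral national regulation": 0.35,
    "Single lab voluntary slowdown": 0.10,
}
RISK_PP_PER_EFFECTIVE_YEAR = 3.0           # assumed midpoint of the 2-4 pp/year current-readiness range

nominal_slowdown_years = 1.0
for scenario, multiplier in RACING_MULTIPLIER.items():
    effective_years = nominal_slowdown_years * multiplier
    print(f"{scenario}: {effective_years:.2f} effective years, "
          f"~{effective_years * RISK_PP_PER_EFFECTIVE_YEAR:.1f} pp x-risk reduction")
```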

Differential Technology Development

Not all acceleration is equal. The concept of differential technology development (introduced by Nick Bostrom) distinguishes between advancing safety-relevant vs. capability-relevant technologies. The ideal is to accelerate safety research while leaving capability timelines unchanged — achieving risk reduction without the costs of delay.

| Development Type | Timeline Effect | X-Risk Effect | Direct Cost |
|---|---|---|---|
| Pure safety acceleration | None | Reduced | Low |
| Pure capability acceleration | Earlier TAI | Increased | Low (but large externality) |
| Mixed research (e.g., interpretability) | Slightly earlier TAI | Net reduced | Low |
| Infrastructure (compute, data) | Earlier TAI | Increased | Variable |

Sensitivity Analysis

The model's conclusions are most sensitive to the following parameters:

| Parameter | Base Case | If Higher | If Lower | Impact on Conclusions |
|---|---|---|---|---|
| Baseline x-risk | 15% | Risk reduction more valuable | Risk reduction less valuable | Changes magnitude but not direction |
| Safety improvement rate | 5 pp/year | Each year of delay more valuable | Each year of delay less valuable | Critical for net assessment |
| Racing multiplier | 0.5 | Unilateral action more effective | Unilateral action less effective | Determines which actions work |
| Conditional value of future | Astronomical | Risk reduction dominates any analysis | Tradeoffs more balanced | Determines whether any acceleration is acceptable |
| Current safety readiness | 0.20 | Additional time less valuable | Additional time more valuable | Key crux for near-term decisions |
| P(alignment is easy) | 20% | Acceleration less dangerous | Acceleration more dangerous | Changes optimal strategy significantly |
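A sketch of a sweep over the two most decision-relevant parameters for a 1-year delay; the grid values and the transition cost are illustrative assumptions:

```python
# Two-parameter sensitivity sweep for EV(+1 year) = risk_rate + cond_rate - cost.
TRANSITION_COST = 0.001   # assumed direct cost of a 1-year shift, in "whole futures"

for risk_rate in (0.005, 0.02, 0.04):        # x-risk reduction per year of delay
    for cond_rate in (-0.005, 0.0, 0.002):   # net conditional-value change per year of delay
        ev = risk_rate + cond_rate - TRANSITION_COST
        verdict = "delay helps" if ev > 0 else "delay hurts"
        print(f"risk rate {risk_rate:.3f}, conditional rate {cond_rate:+.3f}: EV {ev:+.4f} ({verdict})")
```

The sign of the verdict flips only in the corner where risk reduction per year is very small and the conditional-value loss is large, which is the same crux the scenario analysis below highlights.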

Scenario Analysis

| Scenario | Acceleration Assessment | Deceleration Assessment | Optimal Strategy |
|---|---|---|---|
| Alignment is hard, timelines short | Very dangerous | Very valuable | Aggressive deceleration + safety investment |
| Alignment is hard, timelines long | Dangerous | Valuable but less urgent | Steady safety investment, prepare governance |
| Alignment is tractable, timelines short | Moderately dangerous | Moderately valuable | Focus on solving alignment, moderate deceleration |
| Alignment is tractable, timelines long | Roughly neutral | Modest value | Solve alignment, let capabilities proceed |

Implications

For AI Safety Organizations

The model implies that AI safety organizations should evaluate their work partly in terms of effective time purchased. An alignment research program that makes TAI 2 percentage points safer is equivalent to one that delays TAI by roughly 0.5-2 years (at current readiness levels), because both reduce x-risk by similar amounts. This provides a common currency for comparing research and governance work.
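A sketch of the "effective time purchased" conversion, using the 1-4 pp per year marginal range at current readiness; the 2 pp example is the one from the paragraph above:

```python
# Convert a direct x-risk reduction into equivalent years of delay at current readiness.
RISK_PP_PER_YEAR = (1, 4)   # marginal x-risk reduction per year of delay (current readiness range)

def effective_years_purchased(risk_reduction_pp: float):
    low_rate, high_rate = RISK_PP_PER_YEAR
    # Dividing by the high rate gives the low end of the equivalence, and vice versa.
    return risk_reduction_pp / high_rate, risk_reduction_pp / low_rate

lo, hi = effective_years_purchased(2.0)   # a program worth 2 pp of x-risk reduction
print(f"Equivalent to roughly {lo:.1f}-{hi:.1f} years of delay")   # ~0.5-2.0 years
```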

Safety organizations should also consider the racing multiplier when evaluating governance proposals. Proposals that only slow one actor are much less valuable than those that slow the global frontier.

For Capabilities Organizations

Capabilities organizations should account for the marginal x-risk their acceleration creates. At current readiness levels, pulling TAI forward by 1 year incurs the 1-4 pp x-risk cost described above. If the long-term future is worth quadrillions of dollars or more in expected value, this is an enormous externality.

This does not mean all capabilities work is net negative — complementary work that advances both safety and capabilities can be net positive, and acceleration becomes less costly as safety readiness improves.

For Policymakers

Regulation that slows AI development by 1 year is more valuable when safety readiness is lower (as it currently is) and less valuable as readiness improves. This suggests a dynamic regulatory approach: stricter requirements now when marginal preparation time is most valuable, gradually loosening as safety research matures and governance institutions develop capacity.

The racing multiplier is the strongest argument for international coordination: unilateral slowdowns are 2-10x less effective than coordinated ones.

For Forecasters and Funders

The model provides a framework for comparing any intervention on a common scale: how many x-risk-adjusted years does it produce? This enables portfolio optimization across very different intervention types.

| Intervention Type | Effective x-risk-adjusted Years | Cost | Notes |
|---|---|---|---|
| Alignment research | 0.5-3 per breakthrough | $10-100M per breakthrough | Highest ceiling but depends on tractability |
| Compute governance | 0.3-1.5 globally | $50-200M for implementation | High leverage, closing window |
| International coordination | 0.2-1.0 per agreement | $20-100M per agreement | Ranges overlap with compute governance |
| National regulation | 0.1-0.5 (racing-adjusted) | $10-50M for advocacy | Heavily discounted by racing multiplier |
| Voluntary commitments | 0.05-0.2 | $5-20M | Fragile, low counterfactual impact |

Note that the ranges overlap substantially — the ranking is suggestive, not definitive. A high-impact compute governance intervention could outperform a marginal alignment research program. Portfolio diversification across types is likely optimal.
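A sketch of the comparison on a per-dollar basis, using the midpoint of each range in the table above; the closely clustered results illustrate why the ranking should be read as suggestive rather than definitive:

```python
# Rough cost-effectiveness: x-risk-adjusted years purchased per $100M, using range midpoints.
interventions = {
    # name: (midpoint effective years, midpoint cost in $M) from the table above
    "Alignment research":         (1.75, 55.0),
    "Compute governance":         (0.90, 125.0),
    "International coordination": (0.60, 60.0),
    "National regulation":        (0.30, 30.0),
    "Voluntary commitments":      (0.125, 12.5),
}

ranked = sorted(interventions.items(), key=lambda kv: kv[1][0] / kv[1][1], reverse=True)
for name, (years, cost_musd) in ranked:
    print(f"{name}: {years / cost_musd * 100:.2f} effective years per $100M")
```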

Key Uncertainties

Key Questions

  • How much does one additional year of preparation actually reduce existential risk at current readiness levels?
  • How large is the racing multiplier — does unilateral deceleration just shift development elsewhere?
  • Does acceleration ever become net positive, and if so at what safety readiness threshold?
  • How should we weigh the conditional value effects (earlier scientific progress, earlier solutions to other catastrophes) against x-risk increases?
  • Can differential technology development actually work in practice, or does safety research inevitably speed up capabilities too?

Model Limitations

What This Model Captures

This model provides a unified framework for comparing acceleration and deceleration effects across different action types. It quantifies the tradeoff between preparation time and delayed benefits, and identifies current readiness level as the key determinant of whether acceleration is net positive or negative.

What This Model Misses

Endogenous timelines: The model treats TAI arrival time as exogenous, but in reality safety research, governance, and capabilities interact in complex feedback loops. Safety breakthroughs may enable faster capability deployment; regulation may redirect rather than slow research.

Discrete vs. continuous risk: The model assumes a smooth relationship between preparation time and risk, but real risk may be concentrated around specific capability thresholds where preparation either is or is not sufficient.

Political economy: The model does not account for the political dynamics of acceleration and deceleration — who benefits, who bears costs, and how this affects the feasibility of various interventions.

Tail risks and unknown unknowns: The parameter estimates are based on current understanding. Novel alignment failure modes or unexpected capability jumps could invalidate the smooth tradeoff curves assumed here.

Heterogeneity of TAI: "Transformative AI" is not a single event. Different capabilities may arrive at different times, and the risks associated with each may vary independently.

Recursive dynamics: The model treats safety progress and capability progress as independent, but they interact. Most importantly, increasingly capable AI systems may accelerate safety research itself — meaning acceleration could simultaneously reduce preparation time and increase the rate of safety progress. The net effect of this dynamic is deeply uncertain and could change the sign of the model's conclusions for moderate acceleration.

Related Models

  • Safety-Capability Tradeoff Model — When safety and capabilities conflict vs. complement
  • Capability-Alignment Race — Quantifying the gap between capability and safety readiness
  • Intervention Timing Windows — Which interventions have closing windows
  • Racing Dynamics — How competition affects the effectiveness of deceleration
  • Intervention Effectiveness Matrix — Which interventions address which risks


Related Pages

  • Risks: Multipolar Trap (AI Development)
  • Approaches: AI Safety Cases, AI Governance Coordination Technologies
  • Analysis: OpenAI Foundation Governance Paradox
  • People: Yoshua Bengio, Stuart Russell
  • Labs: GovAI
  • Concepts: Safety-Capability Gap, Racing Intensity
  • Transition Model: Lab Behavior, Alignment Progress, Human Oversight Quality, Safety-Capability Gap
  • Key Debates: When Will AGI Arrive?, Open vs Closed Source AI
  • Policy: Voluntary AI Safety Commitments, US Executive Order on Safe, Secure, and Trustworthy AI
  • Organizations: US AI Safety Institute, UK AI Safety Institute