MAIM (Mutually Assured AI Malfunction)

MAIM (Mutually Assured AI Malfunction) is a deterrence framework introduced in the 2025 paper 'Superintelligence Strategy' by Dan Hendrycks (CAIS), Eric Schmidt, and Alexandr Wang. It proposes that rival states will naturally deter each other from pursuing unilateral AI dominance because destabilizing AI projects can be sabotaged through an escalation ladder ranging from espionage to kinetic strikes on datacenters. Critics from MIRI, RAND, and IAPS have raised concerns about observability challenges, unclear red lines, escalation risks, and the fundamental disanalogies between AI deterrence and nuclear MAD. Proponents argue MAIM describes the existing strategic reality and can serve as scaffolding for international cooperation and verification regimes.

Related

People: Dan Hendrycks
Organizations: Center for AI Safety
Risks: AI Development Racing Dynamics
Policies: Compute Governance, US AI Chip Export Controls, International Compute Regimes, International Coordination Mechanisms

Overview

Mutually Assured AI Malfunction (MAIM) is a deterrence framework for managing great-power competition over advanced AI. Introduced by Dan Hendrycks (director of the Center for AI Safety), Eric Schmidt (former Google CEO), and Alexandr Wang (Scale AI CEO) in their March 2025 paper Superintelligence Strategy, MAIM proposes that any state's aggressive bid for unilateral AI dominance will be met with preventive sabotage by rivals.

The framework draws an analogy to nuclear Mutually Assured Destruction (MAD), but operates through preemption rather than retaliation. Because destabilizing AI projects are relatively easy to sabotage — through interventions ranging from covert cyberattacks to kinetic strikes on datacenters — the authors argue that MAIM already describes the strategic picture AI superpowers find themselves in. The resulting stalemate could postpone the emergence of superintelligence, curtail many loss-of-control scenarios, and undercut efforts to secure a strategic monopoly.

MAIM is one pillar of a broader three-part framework alongside nonproliferation (tracking AI chips and preventing rogue access) and competitiveness (guaranteeing domestic chip manufacturing capacity). The paper has generated extensive debate, with critiques from the Machine Intelligence Research Institute (MIRI), RAND, and the Institute for AI Policy and Strategy (IAPS) raising concerns about observability, escalation risks, and the limits of the nuclear analogy.

The Three Pillars

The Superintelligence Strategy paper presents MAIM within a broader strategic framework:

| Pillar | Objective | Key Mechanisms |
|---|---|---|
| Deterrence (MAIM) | Prevent destabilizing AI projects | Espionage, sabotage, credible threat of escalation |
| Nonproliferation | Keep weaponizable AI out of rogue hands | Chip tracking, export controls, supply chain security |
| Competitiveness | Maintain national AI advantage | Domestic chip manufacturing, talent retention, R&D investment |

The deterrence pillar is the most novel and controversial of the three. The nonproliferation and competitiveness pillars build on existing policy proposals around Compute Governance, US AI Chip Export Controls, and Hardware-Enabled Governance.

How MAIM Works

Escalation Ladder

The framework outlines a graduated set of responses to a rival's destabilizing AI development:

| Level | Action | Description | Reversibility |
|---|---|---|---|
| 1 | Intelligence gathering | Espionage on rival AI projects and capabilities | Non-destructive |
| 2 | Covert sabotage | Insider tampering with model weights, training data, or chip fabrication | Partially reversible |
| 3 | Overt cyberattacks | Visible disruption of datacenters, power grids, or cooling systems | Moderately reversible |
| 4 | Kinetic strikes | Physical destruction of AI infrastructure and datacenters | Irreversible |
| 5 | Broader hostilities | Escalation beyond AI-specific targets | Irreversible |

Rather than waiting for a rival to weaponize a superintelligent system, states would act preemptively to disable threatening projects. The authors argue this dynamic stabilizes the strategic landscape without requiring formal treaty negotiations — all that is necessary is that states collectively recognize their strategic situation.

Deterrence Logic

The stabilizing logic requires three conditions: rivals must be able to observe destabilizing projects, they must have credible means to sabotage them, and the threat of sabotage must outweigh the expected gains from pursuing dominance.
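
As a purely illustrative sketch (not from the paper), the third condition can be expressed as an expected-utility comparison: a state pursues dominance only if the expected gain outweighs the expected cost of being maimed. All parameter names and values below are hypothetical.

```python
# Toy model of the MAIM deterrence condition (illustrative only, not from the paper).
# A state is deterred when the expected payoff of pursuing dominance is negative,
# given the rival's chance of detecting the project and successfully sabotaging it.

def pursuit_is_deterred(p_detect: float,
                        p_sabotage_given_detect: float,
                        gain_from_dominance: float,
                        loss_from_maiming: float) -> bool:
    """Return True if the expected value of pursuing dominance is negative."""
    p_maimed = p_detect * p_sabotage_given_detect
    expected_payoff = (1 - p_maimed) * gain_from_dominance - p_maimed * loss_from_maiming
    return expected_payoff < 0

# High observability and credible sabotage make pursuit unattractive.
print(pursuit_is_deterred(0.9, 0.8, gain_from_dominance=100.0, loss_from_maiming=400.0))  # True
# Poor observability flips the calculus even though the threatened losses are unchanged.
print(pursuit_is_deterred(0.2, 0.8, gain_from_dominance=100.0, loss_from_maiming=400.0))  # False
```

The second call makes the observability critique discussed below concrete: lowering the detection probability can undo deterrence even when the threatened losses remain large.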

Proposed Stabilization Policies

The paper recommends several policies to strengthen MAIM stability:

  • Clarify escalation ladders: Establish common knowledge about maiming readiness to prevent misinterpretation of rival actions
  • Prevent chip smuggling: Keep decisions about AI development with rational state actors rather than rogue regimes
  • Remote datacenter placement: Follow the "city avoidance" principle from the nuclear era, reducing collateral damage from potential strikes
  • Transparency and verification: Mutual inspection regimes to reduce false-positive sabotage attacks
  • AI-assisted inspections: Deploy "confidentiality-preserving AI verifiers" that can confirm compliance without revealing proprietary details
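
The paper does not specify how confidentiality-preserving verification would be implemented; the sketch below is one hypothetical illustration of the general idea, in which a lab commits in advance to hashed training-run metadata so a verifier can later check a declared compute figure against an agreed threshold without ever inspecting weights or code. All function names and fields are invented for this example.

```python
# Hypothetical commitment-based compliance check (illustrative only; the paper
# does not describe a concrete protocol for its "AI verifiers").
import hashlib
import json

def commit(metadata: dict, salt: str) -> str:
    """The lab publishes a hash commitment to its training-run metadata in advance."""
    payload = json.dumps(metadata, sort_keys=True) + salt
    return hashlib.sha256(payload.encode()).hexdigest()

def verify(metadata: dict, salt: str, commitment: str, flop_threshold: float) -> bool:
    """The verifier checks that the disclosed metadata matches the prior commitment
    and that the declared training compute stays under an agreed threshold."""
    if commit(metadata, salt) != commitment:
        return False  # disclosure does not match what was committed
    return metadata["total_training_flop"] <= flop_threshold

# Example exchange: commit early, disclose metadata only to the verifier later.
metadata = {"total_training_flop": 8e25, "accelerator_count": 50_000}
salt = "nonce-known-only-to-the-lab"
c = commit(metadata, salt)
print(verify(metadata, salt, c, flop_threshold=1e26))  # True
```

Real proposals in this space rely on much stronger mechanisms (hardware attestation, zero-knowledge proofs); the sketch only illustrates that compliance can in principle be checked against a prior commitment rather than by inspecting the model itself.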

Differences from Nuclear MAD

While Superintelligence Strategy draws a pedagogical parallel between MAIM and MAD, the authors acknowledge these are structurally different frameworks:

| Dimension | Nuclear MAD | AI MAIM |
|---|---|---|
| Mechanism | Retaliation after attack | Preemption before dominance |
| Observability | Relatively high (satellite imagery, seismic detection) | Low (AI development behind closed doors) |
| Subject behavior | Weapons are inert tools | AI systems can adapt and evolve |
| Attribution | Generally clear (missile launches detectable) | Difficult (cyberattacks hard to attribute) |
| Escalation risk | Well-understood doctrine | Novel and untested |
| Red lines | Clear (nuclear use) | Ambiguous (what counts as "destabilizing"?) |

Hendrycks and Khoja later clarified that the analogy was somewhat loose and that the MAIM argument stands on its own merits, independent of how closely it mirrors MAD.

Major Critiques

Observability Problem

A critique published on AI Frontiers highlights that MAIM hinges on nations being able to observe one another's progress toward superintelligence. AI development, however, happens behind closed doors, with breakthroughs often kept as proprietary secrets. This creates two dangerous failure modes: missing important signs of advancement, or misinterpreting routine activity as a threat and triggering unnecessary sabotage.

MIRI's Formal Analysis

The Machine Intelligence Research Institute published a detailed analysis applying formal deterrence theory to MAIM, finding:

  • Unclear red lines: What constitutes a "destabilizing AI project" is ambiguous and difficult to monitor
  • Questionable credibility: Sabotage likely only delays rather than denies rival capabilities, weakening the deterrent
  • Timing problems: Intelligence recursion might proceed too quickly to be identified and responded to
  • Volatile calculus: Immense stakes and uncertainty make deterrence calculations unpredictable

MIRI proposed an alternative regime centered on earlier, more monitorable red lines.

RAND Assessment

RAND noted that the paper makes a critical contribution to the AI policy debate but that the MAIM world it describes is "entirely inconsistent with the current reality of private sector-driven AI development." The gap between this state-centric deterrence model and the actual landscape of private AI labs raises significant implementation questions.

Escalation and Moral Hazard Concerns

Additional critiques include:

  • Escalation risk: Strikes on AI infrastructure could be perceived as acts of war, since that infrastructure is deeply intertwined with economic and military power
  • Moral hazard: Accepting AI malfunction as a strategic tool could lower ethical standards and reduce investment in proactive safety measures
  • Asymmetric perceptions: China may view US chip restrictions as more threatening than American policymakers realize, undermining stable deterrence
  • Attribution challenges: Difficulty attributing cyberattacks creates risk of miscalculation and overreaction

IAPS Stability Analysis

The Institute for AI Policy and Strategy analyzed whether a MAIM regime could remain stable long enough for superintelligence to be developed safely. They concluded this depends crucially on verification — without confidence that rivals are complying, each side faces pressure to defect first.
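
IAPS's defection-pressure argument can be illustrated with a toy best-response calculation (hypothetical payoffs, not drawn from their analysis): when verification is weak, racing goes undetected often enough to dominate restraint, whereas credible verification makes defection likely to be caught and maimed, so restraint becomes the better choice.

```python
# Toy illustration of defection pressure under weak vs. strong verification
# (hypothetical payoffs, not from the IAPS analysis).

def best_response(opponent_races: bool, p_detected: float) -> str:
    """Pick the higher-expected-payoff action for one state."""
    restrain = -2.0 if opponent_races else 0.0           # falling behind is costly
    race = (1 - p_detected) * 3.0 + p_detected * (-4.0)  # undetected advantage vs. being maimed
    return "race" if race > restrain else "restrain"

# Weak verification: detection is unlikely and racing dominates.
print(best_response(opponent_races=False, p_detected=0.2))  # race
# Intrusive verification: detection is near-certain and restraint dominates.
print(best_response(opponent_races=False, p_detected=0.9))  # restrain
```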

Relationship to Other Frameworks

MAIM intersects with several existing governance proposals:

  • Compute Governance: The nonproliferation pillar relies on controlling access to AI-relevant compute
  • US AI Chip Export Controls: Current US restrictions on chip exports to China are a precursor to MAIM-style nonproliferation
  • International Coordination Mechanisms: MAIM's escalation ladders and verification proposals complement international coordination efforts
  • AI Development Racing Dynamics: MAIM attempts to address the same competitive pressures that drive racing dynamics between rival states and labs
  • Pause Advocacy: Some critics argue that advocating for compute pauses or development moratoria would be more effective than deterrence

Path Toward Cooperation

Proponents argue MAIM is not merely a framework for mutual threat but a stepping stone toward deeper cooperation:

  • States would prefer mutual visibility and leverage over an all-out race
  • MAIM's tools — escalation ladders, transparency mechanisms, and verification regimes — provide scaffolding for legitimately enforceable international agreements
  • Deterrence can begin with unilateral capabilities and mature into a system of international verification
  • The framework could eventually evolve into something resembling International Compute Regimes for AI governance

Key Uncertainties

| Uncertainty | Impact on MAIM Viability | Current Assessment |
|---|---|---|
| Observability of AI progress | Foundational: MAIM fails without it | Low confidence in current monitoring |
| Speed of intelligence recursion | Determines response window | Highly contested among experts |
| Private sector vs. state control | Affects who makes deterrence decisions | Current development is private-sector-led |
| Attribution capability | Required for proportionate response | Inadequate for cyber domain |
| Stability of equilibrium | Determines long-term viability | Unclear: no historical precedent |

Related Pages

Labs: Center for AI Safety, Safe Superintelligence Inc.
Risks: Cyberweapons Risk
People: Dan Hendrycks, Ilya Sutskever
Analysis: AI Safety Multi-Actor Strategic Landscape
Concepts: US AI Chip Export Controls, Machine Intelligence Research Institute, AI Development Racing Dynamics, Center for AI Safety, Compute Governance, International Coordination Mechanisms
Transition Model: Geopolitics
Models: Authoritarian Tools Diffusion Model
Historical: The MIRI Era