MAIM (Mutually Assured AI Malfunction)

Approach

Comprehensive reference on the MAIM deterrence framework proposed by Hendrycks, Schmidt, and Wang, covering the core theory, escalation ladder, major critiques from MIRI/RAND/IAPS, and developments through early 2026; IAPS estimates only ~25% probability that all MAIM conditions are met. The page synthesizes primary sources and adversarial analyses with unusual rigor for wiki content, though it is ultimately a well-organized compilation of others' arguments rather than original analysis.

Related
People
Dan Hendrycks
Organizations
Center for AI Safety
Risks
AI Development Racing Dynamics
Concepts
Compute Governance · International Compute Regimes · International Coordination Mechanisms
Policies
US AI Chip Export Controls

Overview

Mutually Assured AI Malfunction (MAIM) is a proposed deterrence framework for managing great-power competition over advanced AI. Introduced by Dan Hendrycks (director of the Center for AI Safety), Eric Schmidt (former Google CEO), and Alexandr Wang (Scale AI CEO) in their March 2025 paper Superintelligence Strategy (revised April 2025), MAIM proposes that any state's aggressive bid for unilateral AI dominance will be met with preventive sabotage by rivals.1

The framework draws an analogy to nuclear Mutually Assured Destruction (MAD), but operates through preemption rather than retaliation. The authors argue that because — in their assessment — destabilizing AI projects are relatively easy to sabotage through interventions ranging from covert cyberattacks to kinetic strikes on datacenters, MAIM already describes the strategic picture AI superpowers find themselves in. The resulting stalemate could, in their view, postpone the emergence of Superintelligence, curtail loss-of-control scenarios, and undercut efforts to secure a strategic monopoly. These claims are contested by analysts who question whether sabotage would be technically effective, whether AI development is sufficiently observable, and whether the nuclear analogy holds in the AI context.

MAIM is one pillar of a broader three-part framework alongside nonproliferation (tracking AI chips and preventing rogue access) and competitiveness (guaranteeing domestic chip manufacturing capacity). The paper has generated extensive debate, with critiques from Machine Intelligence Research Institute, RAND, and the Institute for AI Policy and Strategy raising concerns about observability, escalation risks, and the limits of the nuclear analogy.

Quick Assessment

MAIM's viability depends on a set of empirical conditions that analysts continue to contest. As of early 2026, no state has formally adopted MAIM as doctrine, and the framework has received mixed assessments from security analysts, deterrence theorists, and AI policy researchers.

| Dimension | Assessment | Source |
|---|---|---|
| Theoretical coherence | Internally consistent deterrence premise — rational states would prefer to prevent rivals from achieving decisive AI advantage | Hendrycks et al. (2025) |
| Observability | Assessed as insufficient for stable deterrence — AI development lacks the distinctive physical signatures (seismic, satellite) that enabled nuclear monitoring; DeepSeek-R1 demonstrated compute-light breakthroughs can evade infrastructure-based monitoring proxies | Arnold, AI Frontiers (Aug 2025); RAND (Mar 2025) |
| Sabotage credibility | Disputed — historical cyber sabotage precedents (Stuxnet affected ≈11% of intended centrifuges) and AI systems' recoverability from backups weaken the deterrent | MIRI (Abecassis, Apr 2025) |
| Private-sector alignment | Structural gap — leading AI development is private-sector-led, not state-controlled as MAIM assumes | RAND (Mar 2025) |
| Probability that descriptive MAIM conditions are met | ≈25% combined probability | IAPS (Delaney, Dec 2025) |
| Policy adoption | No formal state adoption identified; US AI Action Plan (July 2025) emphasizes winning unilaterally over mutual deterrence | Perry World House (Nov 2025) |

An IAPS analysis (Delaney, December 2025) assigned approximately 70% probability that China would anticipate near-total loss of great-power status if the US achieves superintelligence unilaterally, ~60% probability that China would pursue escalating sabotage actions in response, and ~60% probability the US would negotiate a mutual slowdown rather than risk nuclear confrontation — yielding approximately 25% combined probability that all descriptive MAIM conditions are met.2

Risks Addressed

MAIM is proposed as a response to several interconnected risks in the development of advanced AI:

  • AI Development Racing Dynamics: Competitive pressure between AI-developing states may incentivize shortcuts on safety and accelerate the timeline toward dangerous capabilities. MAIM's deterrence equilibrium is intended to slow the most destabilizing development efforts.
  • Unilateral AI dominance: The authors argue that a state achieving a strategic monopoly over Superintelligence poses a severe national security threat to all other states, analogous to one power gaining unchallenged control of nuclear weapons. MAIM is intended to prevent this outcome by making the attempt prohibitively costly.1
  • Loss of control from unchecked development: Rapid autonomous AI research — what the paper terms "intelligence recursion" — could produce systems that neither their creators nor rival states can effectively constrain. MAIM is intended to slow development enough for alignment and oversight work to keep pace.1
  • Great-power AI conflict: Unmanaged competition over AI capabilities could trigger direct conflict between major powers. The framework aims to provide a structured alternative to open conflict by channeling competition through sabotage threats rather than direct military confrontation.
  • Rogue state and non-state actor acquisition: The nonproliferation pillar of the framework specifically addresses the risk of weaponizable AI capabilities reaching actors outside the deterrence equilibrium — such as non-state actors or states without the capacity for deterrent reciprocity. This relies on mechanisms like US AI Chip Export Controls and chip tracking via Compute Governance frameworks.

The Three Pillars

The Superintelligence Strategy paper presents MAIM within a broader strategic framework:

| Pillar | Objective | Key Mechanisms |
|---|---|---|
| Deterrence (MAIM) | Prevent destabilizing AI projects | Espionage, sabotage, credible threat of escalation |
| Nonproliferation | Keep weaponizable AI out of rogue hands | Chip tracking, export controls, supply chain security |
| Competitiveness | Maintain national AI advantage | Domestic chip manufacturing, talent retention, R&D investment |

The deterrence pillar is the most novel and controversial of the three. The nonproliferation and competitiveness pillars build on existing policy proposals around Compute Governance, US AI Chip Export Controls, and Hardware-Enabled Governance.

How MAIM Works

Escalation Ladder

The framework outlines a graduated set of responses to a rival's destabilizing AI development:

| Level | Action | Description | Reversibility |
|---|---|---|---|
| 1 | Intelligence gathering | Espionage on rival AI projects and capabilities | Non-destructive |
| 2 | Covert sabotage | Insider tampering with model weights, training data, or chip fabrication | Partially reversible |
| 3 | Overt cyberattacks | Visible disruption of datacenters, power grids, or cooling systems | Moderately reversible |
| 4 | Kinetic strikes | Physical destruction of AI infrastructure and datacenters | Irreversible |
| 5 | Broader hostilities | Escalation beyond AI-specific targets | Irreversible |

The authors argue that states would act preemptively to disable threatening projects rather than waiting for a rival to weaponize a superintelligent system. Whether sabotage at levels 2–4 would be technically achievable is disputed: Google DeepMind argues that distributed cloud computing, decentralized training, and adversaries' hardening of AI infrastructure would make effective sabotage difficult short of actions that risk nuclear escalation.3 MIRI notes that the Stuxnet precedent — often cited in support of cyber sabotage's feasibility — affected approximately 11% of intended Iranian centrifuges, and that AI systems can be restored from prior checkpoints after sabotage attempts.4

Deterrence Logic

```mermaid
flowchart TD
  A[State A pursues<br/>AI dominance] --> B{Rivals detect<br/>destabilizing project?}
  B -->|Yes| C[Rivals assess threat level]
  B -->|No| D[Project proceeds<br/>undetected]
  C --> E{Credible threat<br/>to balance of power?}
  E -->|Yes| F[Escalation ladder<br/>activated]
  E -->|No| G[Monitoring<br/>continues]
  F --> H[Project sabotaged<br/>or delayed]
  H --> I[Deterrence reinforced:<br/>MAIM equilibrium]
  D --> J[Observability<br/>failure]

  style A fill:#f9d71c,stroke:#333
  style H fill:#ff6b6b,stroke:#333
  style J fill:#ff6b6b,stroke:#333
  style I fill:#90EE90,stroke:#333
```

The diagram illustrates two pathways from the proponents' model: successful deterrence and observability failure. Critics note that additional failure modes — including false-positive sabotage (misidentifying a non-threatening project as destabilizing) and escalation spirals (sabotage triggering countermeasures rather than renewed deterrence) — are not shown, and that these paths may be as likely as the depicted equilibrium outcome.53

The stabilizing logic requires three conditions: rivals must be able to observe destabilizing projects, they must have credible means to sabotage them, and the threat of sabotage must outweigh the expected gains from pursuing dominance. All three conditions are disputed in the critical literature reviewed below.
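
The three conditions can be folded into one expected-value sketch. The formalization below is illustrative and ours, not the paper's: the symbols p_o, p_s, G, and C are assumptions standing in for observation probability, sabotage success probability, the aspirant's gain from dominance, and the cost sabotage imposes.

```latex
% Minimal expected-value sketch of the MAIM deterrence conditions.
% Illustrative only; this formalization is not from Hendrycks et al. (2025).
%   p_o : probability rivals observe the destabilizing project
%   p_s : probability sabotage succeeds, given observation
%   G   : the aspirant's payoff from achieving AI dominance
%   C   : cost imposed by sabotage and the ensuing escalation
% The dominance bid has negative expected value (i.e., is deterred) when:
\[
(1 - p_o p_s)\,G \;-\; p_o p_s\,C \;<\; 0
\quad\Longleftrightarrow\quad
p_o p_s \;>\; \frac{G}{G + C}.
\]
```

On this reading, the observability critique attacks p_o, the sabotage-credibility critique attacks p_s, and the "volatile calculus" critique holds that G and C are too uncertain for either side to know which side of the threshold it is on.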

Proposed Stabilization Policies

The paper recommends several policies to strengthen MAIM stability:

  • Clarify escalation ladders: Establish common knowledge about maiming readiness to prevent misinterpretation of rival actions
  • Prevent chip smuggling: Keep decisions about AI development with rational state actors rather than rogue regimes. CNAS (2025) reported an estimated 10,000 to several hundred thousand smuggled AI chips reaching China in 2024, with a median estimate of ~140,000 — evidence that the current nonproliferation regime is already under pressure.6
  • Remote datacenter placement: Follow the "city avoidance" principle from nuclear doctrine, placing datacenters away from urban areas to limit civilian collateral damage from potential strikes. Hendrycks elaborated in a March 2025 interview that this mirrors how Cold War nuclear doctrine sought to distinguish military targets from population centers — reducing the risk of sabotage escalating into broader civilian harm.7
  • Transparency and verification: Mutual inspection regimes to reduce false-positive sabotage attacks
  • AI-assisted inspections: Deploy "confidentiality-preserving AI verifiers" that can confirm compliance without revealing proprietary details

Differences from Nuclear MAD

While Superintelligence Strategy draws a pedagogical parallel between MAIM and MAD, the authors acknowledge these are structurally different frameworks:

DimensionNuclear MADAI MAIM
MechanismRetaliation after attackPreemption before dominance
ObservabilityRelatively high (satellite imagery, seismic detection)Low (AI development behind closed doors)
Subject behaviorWeapons are inert toolsAI systems can adapt and evolve
AttributionGenerally clear (missile launches detectable)Difficult (cyberattacks hard to attribute)
Escalation riskWell-understood doctrineNovel and untested
Red linesClear (nuclear use)Ambiguous (what counts as "destabilizing"?)

Hendrycks clarified in a follow-up response to critics that the analogy was pedagogical rather than structural, and that the MAIM argument is intended to stand on its own merits, independent of how closely it mirrors MAD.8

Major Critiques

Observability Problem

A critique published on AI Frontiers (Arnold, Virginia Commonwealth University, August 2025) highlights that MAIM hinges on nations observing one another's progress toward superintelligence.5 AI development happens behind closed doors with breakthroughs often concealed as proprietary secrets. This creates two dangerous failure modes: missing important signs of advancement, or misinterpreting normal activity as a threat and triggering unnecessary sabotage.

Arnold further notes that DeepSeek-R1 demonstrated that algorithmic innovations can drive significant capability gains with a fraction of the compute that infrastructure-monitoring approaches would flag — undermining monitoring frameworks based on compute proxies. Chinese government oversight also provides asymmetric counterintelligence advantages over decentralized Western labs: centralized state control makes it comparatively easier to conceal domestic AI progress from foreign intelligence services.5

MIRI's Formal Analysis

Machine Intelligence Research Institute published a detailed analysis applying formal deterrence theory to MAIM (Abecassis, April 2025), finding:4

  • Unclear red lines: What constitutes a "destabilizing AI project" is ambiguous and difficult to monitor; the paper's proposed red line of "fully automated AI research" arrives too late, with minimal response time before a potentially decisive capability gain
  • Questionable credibility: Sabotage likely only delays rather than denies rival capabilities — AI systems can be restored from prior checkpoints, weakening the deterrent
  • Timing problems: Intelligence recursion might proceed too quickly to be identified and responded to before a decisive advantage is achieved
  • Volatile calculus: Immense stakes and uncertainty make deterrence calculations unpredictable

MIRI proposed an alternative regime centered on earlier, more monitorable red lines — such as controls at the advanced chip fabrication stage rather than during active AI development.

RAND Assessment

RAND noted (Rehman, Mueller, Mazarr, March 2025) that the paper makes a contribution to the AI policy debate, but raised concerns that the MAIM world described is "entirely inconsistent with the current reality of private sector-driven AI development."3 The gap between the state-centric deterrence model and the actual landscape of private AI labs raises implementation questions. RAND also argued that MAIM would exacerbate rather than dampen instability by creating potent first-strike incentives — rational actors might calculate it is better to act first rather than wait for a rival to achieve dominance.

RAND additionally noted that China finds current US chip restrictions more deeply threatening than American policymakers may recognize — a perception asymmetry that could accelerate the dynamic MAIM is meant to stabilize.3

Escalation and Moral Hazard Concerns

Additional critiques include:

  • Escalation risk: Strikes on AI infrastructure could be perceived as acts of war, since AI infrastructure is deeply intertwined with economic and military power
  • Moral hazard: Accepting AI malfunction as a strategic tool could lower ethical standards and reduce investment in proactive safety measures
  • Asymmetric perceptions: China may view US chip restrictions as more threatening than American policymakers realize, undermining stable deterrence
  • Attribution challenges: Difficulty attributing cyberattacks creates risk of miscalculation and overreaction
  • Competing unilateralist approaches: AEI and others argue the US should pursue AI dominance rather than mutual deterrence — that a world where the US reaches superintelligence first is preferable to a MAIM equilibrium9

IAPS Stability Analysis

The Institute for AI Policy and Strategy published a detailed analysis (Delaney, December 2025) assessing whether a MAIM regime could remain stable long enough for superintelligence to be developed safely.2 Delaney used probabilistic reasoning to estimate conditions for "descriptive MAIM":

  • ~70% probability China would anticipate near-total loss of great-power status if the US achieves superintelligence unilaterally
  • ~60% probability China would pursue escalating sabotage actions (from cyberattacks to conventional strikes) in response
  • ~60% probability the US would negotiate a mutual slowdown rather than risk nuclear confrontation

These estimates yield approximately 25% combined probability that all descriptive MAIM conditions are met. Delaney recommended that actors develop verification technologies, advance alignment research, and elevate AI risk awareness in government regardless of which strategic framework ultimately proves correct.2
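
The headline figure is the product of the three estimates, which treats the conditions as a conjunction of roughly independent events (the independence reading is our gloss on how the combined figure is obtained):

```latex
% Combined probability of the descriptive MAIM conditions (IAPS estimates,
% Delaney, Dec 2025), treating the three events as roughly independent:
\[
P(\text{descriptive MAIM}) \approx 0.70 \times 0.60 \times 0.60 = 0.252 \approx 25\%.
\]
```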

Limitations

In addition to the external critiques above, MAIM faces structural limitations that constrain what the framework can achieve even if its core deterrence logic holds:

Private-sector AI development: Leading AI labs (including OpenAI, Anthropic, and Google DeepMind) are private companies, not state agencies. MAIM assumes governments can control the pace and character of their domestic AI development and commit credibly to restraint. Without mechanisms to bind private actors to state deterrence commitments, the framework may stabilize state-to-state relations while leaving private-sector development dynamics unaddressed. RAND described this as the framework being "entirely inconsistent with the current reality of private sector-driven AI development."3

Attribution asymmetry: Even if a sabotage event occurs, definitively attributing it to a specific state actor is difficult in the cyber and covert operations domain. False attribution creates miscalculation risk; correct attribution creates escalation pressure regardless. The problem is compounded by the possibility that non-state actors could conduct sabotage operations mimicking state-level capabilities.

Window of vulnerability during rapid capability gains: If an AI system undergoes intelligence recursion — rapidly accelerating self-improvement — the window for detection and response may be shorter than any deterrence mechanism can accommodate. Fast-takeoff scenarios could occur in timelines incompatible with the espionage-verify-respond cycle MAIM relies on.5

Verification bootstrapping problem: The stable verification regimes MAIM requires depend on prior bilateral cooperation to establish. Verification infrastructure requires both parties to agree on inspection protocols, share sufficient technical information, and trust the process — a level of cooperation that deterrence frameworks are generally proposed as an alternative to. As of early 2026, US-China intergovernmental AI safety dialogue has not met since May 2024.10

Compute-light capability pathways: The DeepSeek-R1 model demonstrated that algorithmic innovations can drive substantial capability gains with a fraction of the compute that infrastructure-monitoring approaches would flag. If the threshold for a destabilizing AI capability can be crossed without building large, detectable datacenters, then monitoring proxies based on compute alone may be insufficient to trigger deterrence responses.5

Cooperative Pathways and Skeptical Views

Proponent Arguments for MAIM as Cooperation Scaffolding

Proponents argue MAIM is not merely a framework for mutual threat but a potential pathway toward structured cooperation:

  • States may prefer mutual visibility and leverage over an unregulated race, making verification agreements self-interested rather than altruistic
  • MAIM's proposed tools — escalation ladders, transparency mechanisms, and verification regimes — could provide scaffolding for legitimately enforceable international agreements over time
  • Deterrence could begin with unilateral capabilities and mature into systems of international verification analogous to nuclear arms control treaties
  • The framework could eventually evolve into something resembling International Compute Regimes for AI governance

Skeptical Perspectives on the Cooperative Pathway

Critics challenge the assumption that MAIM can serve as scaffolding for cooperation:

  • Adversarial entrenchment: Some analysts argue that framing AI governance through deterrence logic entrenches adversarial dynamics, making the trust-building necessary for cooperative governance harder to achieve. Signaling willingness to sabotage rivals may harden strategic calculations rather than opening space for negotiation.
  • US policy posture: The Trump administration's AI Action Plan (July 2025) framed US AI strategy as "winning the race" and characterized international governance efforts by international bodies as advocating "burdensome regulations" subject to CCP influence — a posture that treats international agreements as competitive liabilities rather than stability tools.10
  • Dialogue gap: US-China intergovernmental AI safety dialogue has not met since the Geneva meeting in May 2024, even as the Trump-Xi APEC meeting (October 2025) produced an agreement to "consider cooperation on AI" in 2026. The gap between diplomatic signals and operational cooperation frameworks remains substantial.10
  • Private-sector exclusion: If key AI developers are private firms, interstate verification regimes may not bind the actors most likely to trigger a destabilizing breakthrough.

Relationship to Other Frameworks

MAIM intersects with several existing governance proposals:

  • Compute Governance: The nonproliferation pillar relies on controlling access to AI-relevant compute
  • US AI Chip Export Controls: Current US chip restrictions to China are a precursor to MAIM-style nonproliferation
  • International Coordination Mechanisms: MAIM's escalation ladders and verification proposals complement international coordination efforts
  • AI Development Racing Dynamics: MAIM attempts to defuse the same competitive pressures that drive racing between AI-developing states
  • Pause Advocacy: Some critics argue that advocating for compute pauses or development moratoria would be more effective than deterrence

Developments Since Publication

Since the initial release of Superintelligence Strategy (March 2025), the MAIM framework has generated significant academic and policy engagement:

Follow-up analyses: MIRI (Abecassis, April 2025) and AI Frontiers (Arnold, August 2025) published formal critiques focusing on deterrence theory and observability respectively; IAPS (Delaney, December 2025) conducted the most comprehensive probabilistic assessment, estimating ~25% probability that MAIM conditions are met.452 Hendrycks published a defense of the framework on AI Frontiers in 2025, responding to critics including Wildeford and Delaney.8

AI-enabled cyber operations: In November 2025, Anthropic disclosed what it assessed with high confidence as a Chinese state-sponsored group (designated GTG-1002) that used Claude Code to execute a large-scale cyber espionage campaign against approximately 30 global targets — operating with AI autonomy at 80–90% of decision points and at speeds "impossible for human hackers."11 The disclosure provided concrete data for the escalation ladder's lower rungs: AI-enabled cyber operations against technology infrastructure are already occurring, though the documented campaign was espionage-oriented (data exfiltration) rather than capability-denial oriented (sabotage of AI development).

AI chip smuggling: CNAS (2025) documented an estimated 10,000 to several hundred thousand AI chips smuggled to China in 2024, with a median estimate of ~140,000 — suggesting the nonproliferation pillar's compute-monitoring assumptions face significant practical challenges.6 Of 22 notable PRC-developed AI models identified by 2025, only 2 were trained with Chinese chips, indicating continued reliance on restricted hardware.

US policy trajectory: The Trump administration's AI Action Plan (July 2025) explicitly framed US AI strategy as "winning the race" rather than mutual deterrence, diverging from the MAIM framework's prescriptive recommendations. The paper was released as Trump dropped Biden-era AI safety guidelines.12

US-China dialogue: Trump and Xi agreed at the October 2025 APEC summit to consider AI cooperation in 2026, and Trump at the 2025 UN General Assembly called for international verification of AI applications to bioweapons. Some analysts read this as consistent with MAIM-style verification concepts, though no formal verification framework has been proposed and the bilateral AI safety dialogue has not resumed since May 2024.10

Authors' position: As of early 2026, Hendrycks, Schmidt, and Wang have not retracted or substantially revised the MAIM framework. The revised "Expert Version" (arXiv v2, April 14, 2025) represents the last tracked update to the paper.18

Key Uncertainties

| Uncertainty | Impact on MAIM Viability | Current Assessment |
|---|---|---|
| Observability of AI progress | Foundational — deterrence requires detection of destabilizing projects | Multiple analysts assess current monitoring as insufficient |
| Speed of intelligence recursion | Determines whether any response window exists | Highly contested; fast-takeoff scenarios would shrink response windows dramatically |
| Private sector vs. state control | Affects whether state deterrence commitments can bind key actors | Leading AI development is currently private-sector-led |
| Attribution capability | Required for proportionate, targeted response | Assessed as insufficient for the cyber domain as of early 2026 |
| Stability of equilibrium | Determines long-term viability | No historical precedent; IAPS estimates ≈25% probability conditions are met |

Footnotes

  1. Dan Hendrycks, Eric Schmidt, Alexandr Wang. "Superintelligence Strategy", arXiv:2503.05628, March 7, 2025 (v2: April 14, 2025).

  2. Oscar Delaney (IAPS). "Crucial Considerations in ASI Deterrence", Institute for AI Policy and Strategy, December 12, 2025.

  3. Iskander Rehman, Karl P. Mueller, Michael J. Mazarr (RAND). "Seeking Stability in the Competition for AI Advantage", RAND, March 2025.

  4. David Abecassis (MIRI). "Refining MAIM: Identifying Changes Required to Meet Conditions for Deterrence", Machine Intelligence Research Institute, April 11, 2025.

  5. Jason Ross Arnold (Virginia Commonwealth University). "Superintelligence Deterrence Has an Observability Problem", AI Frontiers, August 14, 2025 (modified March 20, 2026).

  6. CNAS. "Countering AI Chip Smuggling Has Become a National Security Priority", 2025.

  7. Dan Hendrycks, interviewed by Kevin Frazier. "Lawfare Daily: Dan Hendrycks on National Security in the Age of Superintelligent AI", Lawfare, March 20, 2025.

  8. Dan Hendrycks. "AI Deterrence Is Our Best Option", AI Frontiers, 2025.

  9. James Pethokoukis (AEI). "If the Race to Superintelligence Can Be Won, America Should Win It", American Enterprise Institute, April 24, 2025.

  10. Kevin Werbach (Perry World House). "U.S.-China AI Cooperation Under Trump 2.0", November 24, 2025; "How China and the US Can Make AI Safer for Everyone", The Diplomat, January 2026.

  11. Anthropic. "Disrupting the first reported AI-orchestrated cyber espionage campaign", November 2025.

  12. TechCrunch. "Eric Schmidt argues against a 'Manhattan Project for AGI'", March 5, 2025.

References

Hendrycks, D., Schmidt, E., & Wang, A. "Superintelligence Strategy", arXiv, 2025. Paper.

This paper by Hendrycks, Schmidt, and Wang proposes a comprehensive national security strategy for superintelligence—AI systems vastly superior to humans across cognitive tasks. The authors argue that rapid AI advances pose destabilizing geopolitical risks, including lowered barriers for catastrophic misuse by rogue actors and potential great-power conflict over AI dominance. They introduce Mutual Assured AI Malfunction (MAIM), a deterrence framework analogous to nuclear MAD where states prevent rivals' unilateral AI dominance through preventive sabotage. The paper outlines a three-part strategy combining deterrence, nonproliferation to hostile actors, and competitive strengthening through AI development.

★★★☆☆
Kevin Werbach (Perry World House). "U.S.-China AI Cooperation Under Trump 2.0", perryworldhouse.upenn.edu, November 2025.

Kevin Werbach analyzes the prospects for U.S.-China AI cooperation under the second Trump administration, arguing that neither country can address AI risks alone and that Trump's dismissal of global AI governance will increase China's influence in international organizations while reducing their overall effectiveness. The piece compares how both nations frame the AI competition and identifies areas where direct bilateral engagement remains necessary.

Anthropic. "Disrupting the first reported AI-orchestrated cyber espionage campaign", November 2025.

Anthropic reports detecting a sophisticated September 2025 espionage campaign in which a suspected Chinese state-sponsored group weaponized Claude Code as an autonomous agent to attack roughly thirty global targets including tech companies, financial institutions, and government agencies. This is described as the first documented large-scale cyberattack executed without substantial human intervention, leveraging AI capabilities in intelligence, agency, and tool use. Anthropic responded by banning accounts, notifying victims, coordinating with authorities, and expanding detection capabilities.

★★★★☆

CNAS. "Countering AI Chip Smuggling Has Become a National Security Priority", 2025.

This CNAS report examines the growing threat of AI chip smuggling as a national security concern, analyzing how illicit networks circumvent U.S. export controls to acquire advanced semiconductors. It explores enforcement gaps, smuggling methods, and policy recommendations to strengthen controls on advanced AI hardware reaching adversarial actors.

★★★★☆

Related Wiki Pages

Top Related Pages

Risks

Multipolar Trap (AI Development)

Analysis

Authoritarian Tools Diffusion Model · AI Safety Multi-Actor Strategic Landscape

Approaches

Pause Advocacy · Hardware-Enabled Governance

Organizations

Center for AI Safety · Anthropic · OpenAI · Machine Intelligence Research Institute · Google DeepMind · DeepSeek

Other

Dan Hendrycks · Ilya Sutskever

Concepts

Superintelligence · Self-Improvement and Recursive Enhancement

Policy

China AI Regulatory Framework

Historical

International AI Safety Summit Series · Anthropic-Pentagon Standoff (2026) · The MIRI Era