MAIM (Mutually Assured AI Malfunction)
MAIM (Mutually Assured AI Malfunction) is a deterrence framework introduced in the 2025 paper 'Superintelligence Strategy' by Dan Hendrycks (CAIS), Eric Schmidt, and Alexandr Wang. It proposes that rival states will naturally deter each other from pursuing unilateral AI dominance because destabilizing AI projects can be sabotaged through an escalation ladder ranging from espionage to kinetic strikes on datacenters. Critics from MIRI, RAND, and IAPS have raised concerns about observability challenges, unclear red lines, escalation risks, and the fundamental disanalogies between AI deterrence and nuclear MAD. Proponents argue MAIM describes the existing strategic reality and can serve as scaffolding for international cooperation and verification regimes.
Overview
Mutually Assured AI Malfunction (MAIM) is a deterrence framework for managing great-power competition over advanced AI. Introduced by Dan Hendrycks (director of the Center for AI Safety), Eric Schmidt (former Google CEO), and Alexandr Wang (Scale AI CEO) in their March 2025 paper Superintelligence Strategy, MAIM proposes that any state's aggressive bid for unilateral AI dominance will be met with preventive sabotage by rivals.
The framework draws an analogy to nuclear Mutually Assured Destruction (MAD), but operates through preemption rather than retaliation. Because destabilizing AI projects are relatively easy to sabotage — through interventions ranging from covert cyberattacks to kinetic strikes on datacenters — the authors argue that MAIM already describes the strategic picture AI superpowers find themselves in. The resulting stalemate could postpone the emergence of superintelligence, curtail many loss-of-control scenarios, and undercut efforts to secure a strategic monopoly.
MAIM is one pillar of a broader three-part framework alongside nonproliferation (tracking AI chips and preventing rogue access) and competitiveness (guaranteeing domestic chip manufacturing capacity). The paper has generated extensive debate, with critiques from the Machine Intelligence Research Institute (MIRI), RAND, and the Institute for AI Policy and Strategy (IAPS) raising concerns about observability, escalation risks, and the limits of the nuclear analogy.
The Three Pillars
The Superintelligence Strategy paper presents MAIM within a broader strategic framework:
| Pillar | Objective | Key Mechanisms |
|---|---|---|
| Deterrence (MAIM) | Prevent destabilizing AI projects | Espionage, sabotage, credible threat of escalation |
| Nonproliferation | Keep weaponizable AI out of rogue hands | Chip tracking, export controls, supply chain security |
| Competitiveness | Maintain national AI advantage | Domestic chip manufacturing, talent retention, R&D investment |
The deterrence pillar is the most novel and controversial. The nonproliferation and competitiveness pillars build on existing policy proposals around compute governance, US AI chip export controls, and hardware-enabled governance mechanisms.
How MAIM Works
Escalation Ladder
The framework outlines a graduated set of responses to a rival's destabilizing AI development:
| Level | Action | Description | Reversibility |
|---|---|---|---|
| 1 | Intelligence gathering | Espionage on rival AI projects and capabilities | Non-destructive |
| 2 | Covert sabotage | Insider tampering with model weights, training data, or chip fabrication | Partially reversible |
| 3 | Overt cyberattacks | Visible disruption of datacenters, power grids, or cooling systems | Moderately reversible |
| 4 | Kinetic strikes | Physical destruction of AI infrastructure and datacenters | Irreversible |
| 5 | Broader hostilities | Escalation beyond AI-specific targets | Irreversible |
Rather than waiting for a rival to weaponize a superintelligent system, states would act preemptively to disable threatening projects. The authors argue this dynamic stabilizes the strategic landscape without requiring formal treaty negotiations — all that is necessary is that states collectively recognize their strategic situation.
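The ladder is meant to be graduated rather than all-or-nothing: a state picks the least severe rung expected to disable the threatening project. A minimal sketch of that structure (the rung names and reversibility labels come from the table above; the data-structure framing and the `minimal_response` helper are illustrative, not from the paper):

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class EscalationStep:
    level: int
    action: str
    reversibility: str

# The five rungs as listed in the table above.
LADDER = [
    EscalationStep(1, "intelligence gathering", "non-destructive"),
    EscalationStep(2, "covert sabotage", "partially reversible"),
    EscalationStep(3, "overt cyberattacks", "moderately reversible"),
    EscalationStep(4, "kinetic strikes", "irreversible"),
    EscalationStep(5, "broader hostilities", "irreversible"),
]

def minimal_response(required_severity: int) -> Optional[EscalationStep]:
    """Pick the lowest rung at or above the severity judged necessary,
    reflecting the graduated (rather than all-or-nothing) design."""
    candidates = [step for step in LADDER if step.level >= required_severity]
    return candidates[0] if candidates else None
```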
Deterrence Logic
The stabilizing logic requires three conditions: rivals must be able to observe destabilizing projects, they must have credible means to sabotage them, and the threat of sabotage must outweigh the expected gains from pursuing dominance.
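The paper itself does not formalize this calculus, but the three conditions can be sketched as a simple expected-value inequality: a rival is deterred when the chance of being observed and successfully maimed, weighted by the resulting loss, exceeds the expected gain from an unchallenged dominance bid. A minimal sketch, with all parameter names and values hypothetical:

```python
def is_deterred(p_detect: float,
                p_sabotage_success: float,
                loss_if_maimed: float,
                gain_if_dominant: float) -> bool:
    """Illustrative deterrence condition: the expected cost of pursuing
    dominance (being observed and successfully maimed) must exceed the
    expected gain from a bid that goes unchallenged."""
    p_maimed = p_detect * p_sabotage_success
    expected_cost = p_maimed * loss_if_maimed
    expected_gain = (1 - p_maimed) * gain_if_dominant
    return expected_cost > expected_gain

# Even strong sabotage capability fails to deter when observability
# (p_detect) is low, foreshadowing the observability critique below.
print(is_deterred(p_detect=0.2, p_sabotage_success=0.9,
                  loss_if_maimed=100.0, gain_if_dominant=100.0))  # False
print(is_deterred(p_detect=0.9, p_sabotage_success=0.9,
                  loss_if_maimed=100.0, gain_if_dominant=100.0))  # True
```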
Proposed Stabilization Policies
The paper recommends several policies to strengthen MAIM stability:
- Clarify escalation ladders: Establish common knowledge about maiming readiness to prevent misinterpretation of rival actions
- Prevent chip smuggling: Keep decisions about AI development with rational state actors rather than rogue regimes
- Remote datacenter placement: Follow the "city avoidance" principle from the nuclear era, reducing collateral damage from potential strikes
- Transparency and verification: Mutual inspection regimes to reduce false-positive sabotage attacks
- AI-assisted inspections: Deploy "confidentiality-preserving AI verifiers" that can confirm compliance without revealing proprietary details
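The paper leaves the design of such verifiers open. One way to see the idea is as a commitment-style check: the inspected party commits to an artifact (say, a training-run record) and a verifier confirms a compliance property against that commitment without the raw data being disclosed. The sketch below is purely illustrative; hash commitments and a FLOP cap stand in for whatever cryptographic or AI-based mechanism a real regime would use:

```python
import hashlib
import json

def commit(artifact: dict) -> str:
    """Inspected party publishes a hash commitment to its training-run record."""
    payload = json.dumps(artifact, sort_keys=True).encode()
    return hashlib.sha256(payload).hexdigest()

def verify(artifact: dict, commitment: str, flop_cap: float) -> bool:
    """The verifier role: sees the artifact privately, checks it matches the
    public commitment and satisfies the declared cap, and reports only the
    boolean verdict, never the proprietary contents."""
    return commit(artifact) == commitment and artifact["training_flop"] <= flop_cap

run = {"model_id": "run-417", "training_flop": 8e25}  # hypothetical record
public_commitment = commit(run)
print(verify(run, public_commitment, flop_cap=1e26))  # True: compliant
```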
Differences from Nuclear MAD
While Superintelligence Strategy draws a pedagogical parallel between MAIM and MAD, the authors acknowledge these are structurally different frameworks:
| Dimension | Nuclear MAD | AI MAIM |
|---|---|---|
| Mechanism | Retaliation after attack | Preemption before dominance |
| Observability | Relatively high (satellite imagery, seismic detection) | Low (AI development behind closed doors) |
| Subject behavior | Weapons are inert tools | AI systems can adapt and evolve |
| Attribution | Generally clear (missile launches detectable) | Difficult (cyberattacks hard to attribute) |
| Escalation risk | Well-understood doctrine | Novel and untested |
| Red lines | Clear (nuclear use) | Ambiguous (what counts as "destabilizing"?) |
In a later AI Frontiers essay, Hendrycks and Khoja clarified that the analogy was somewhat loose and that the MAIM argument stands on its own merits, independent of how closely it mirrors MAD.
Major Critiques
Observability Problem
A critique published on AI Frontiers highlights that MAIM hinges on nations observing one another's progress toward superintelligence. AI development happens behind closed doors with breakthroughs often concealed as proprietary secrets. This creates two dangerous failure modes: missing important signs of advancement, or misinterpreting normal activity as a threat and triggering unnecessary sabotage.
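The second failure mode is amplified by base rates: if genuinely destabilizing projects are rare among the activity a rival can observe, even a reasonably accurate monitoring system will mostly flag benign work. A minimal Bayes-rule sketch, with all numbers hypothetical:

```python
# Hypothetical monitoring figures, chosen only to illustrate the base-rate effect.
p_destabilizing = 0.02      # share of observed projects that are truly destabilizing
p_flag_given_bad = 0.90     # detector sensitivity
p_flag_given_benign = 0.10  # false-alarm rate on ordinary commercial projects

# P(truly destabilizing | flagged), by Bayes' rule
p_flag = (p_destabilizing * p_flag_given_bad
          + (1 - p_destabilizing) * p_flag_given_benign)
posterior = p_destabilizing * p_flag_given_bad / p_flag
print(f"{posterior:.0%}")  # ~16%: most flagged projects would be benign
```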
MIRI's Formal Analysis
MIRI published a detailed analysis applying formal deterrence theory to MAIM, finding:
- Unclear red lines: What constitutes a "destabilizing AI project" is ambiguous and difficult to monitor
- Questionable credibility: Sabotage likely only delays rather than denies rival capabilities, weakening the deterrent
- Timing problems: Intelligence recursion might proceed too quickly to be identified and responded to
- Volatile calculus: Immense stakes and uncertainty make deterrence calculations unpredictable
MIRI proposed an alternative regime centered on earlier, more monitorable red lines.
RAND Assessment
RAND noted that the paper makes a critical contribution to the AI policy debate, but the MAIM world described is "entirely inconsistent with the current reality of private sector-driven AI development." The gap between the state-centric deterrence model and the actual landscape of private AI labs raises implementation questions.
Escalation and Moral Hazard Concerns
Additional critiques include:
- Escalation risk: Strikes on AI infrastructure could be perceived as acts of war, since AI infrastructure is deeply intertwined with economic and military power
- Moral hazard: Accepting AI malfunction as a strategic tool could lower ethical standards and reduce investment in proactive safety measures
- Asymmetric perceptions: China may view US chip restrictions as more threatening than American policymakers realize, undermining stable deterrence
- Attribution challenges: Difficulty attributing cyberattacks creates risk of miscalculation and overreaction
IAPS Stability Analysis
The Institute for AI Policy and Strategy analyzed whether a MAIM regime could remain stable long enough for superintelligence to be developed safely. They concluded this depends crucially on verification — without confidence that rivals are complying, each side faces pressure to defect first.
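One way to see that defection pressure is as a prisoner's-dilemma-style payoff structure: without verification, racing is each side's best response no matter what the rival does. A toy illustration, with arbitrary payoff values chosen only to exhibit that structure:

```python
# Toy payoffs (row player's value); higher is better. Values are arbitrary,
# set so that racing strictly dominates when compliance is unverifiable.
PAYOFFS = {
    ("restrain", "restrain"): 3,   # stable MAIM regime
    ("restrain", "race"):     0,   # rival defects first
    ("race",     "restrain"): 4,   # successful unilateral bid
    ("race",     "race"):     1,   # all-out race
}

for rival in ("restrain", "race"):
    best = max(("restrain", "race"), key=lambda me: PAYOFFS[(me, rival)])
    print(f"If rival plays {rival!r}, best response is {best!r}")
# Both lines print 'race': absent verification, each side is pressed to defect.
```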
Relationship to Other Frameworks
MAIM intersects with several existing governance proposals:
- Compute Governance: The nonproliferation pillar relies on controlling access to AI-relevant compute
- US AI Chip Export Controls: Current US chip restrictions on China are a precursor to MAIM-style nonproliferation
- International Coordination Mechanisms: MAIM's escalation ladders and verification proposals complement international coordination efforts
- AI Development Racing Dynamics: MAIM attempts to address the same competitive pressures that drive racing between states and labs
- Pause Advocacy: Some critics argue that advocating for compute pauses or development moratoria would be more effective than deterrence
Path Toward Cooperation
Proponents argue MAIM is not merely a framework for mutual threat but a stepping stone toward deeper cooperation:
- States would prefer mutual visibility and leverage over an all-out race
- MAIM's tools (escalation ladders, transparency mechanisms, and verification regimes) provide scaffolding for legitimate, enforceable international agreements
- Deterrence can begin with unilateral capabilities and mature into a system of international verification
- The framework could eventually evolve into something resembling international compute regimes for AI governance
Key Uncertainties
| Uncertainty | Impact on MAIM Viability | Current Assessment |
|---|---|---|
| Observability of AI progress | Foundational — MAIM fails without it | Low confidence in current monitoring |
| Speed of intelligence recursion | Determines response window | Highly contested among experts |
| Private sector vs. state control | Affects who makes deterrence decisions | Current development is private-sector-led |
| Attribution capability | Required for proportionate response | Inadequate for cyber domain |
| Stability of equilibrium | Determines long-term viability | Unclear — no historical precedent |
Sources and Further Reading
- Hendrycks, Schmidt, Wang. "Superintelligence Strategy" (arXiv:2503.05628, March 2025)
- Hendrycks and Khoja. "AI Deterrence Is Our Best Option" (AI Frontiers, September 2025)
- MIRI. "Refining MAIM: Identifying Changes Required to Meet Conditions for Deterrence" (April 2025)
- "Superintelligence Deterrence Has an Observability Problem" (AI Frontiers)
- RAND. "Seeking Stability in the Competition for AI Advantage" (March 2025)
- IAPS. "Crucial Considerations in ASI Deterrence"
- Full paper website: nationalsecurity.ai