Pause Advocacy
Comprehensive analysis of pause advocacy as an AI safety intervention, estimating 15-40% probability of meaningful policy implementation by 2030 with potential to provide 2-5 years of additional safety research time. Evaluates tractability (25-35%), political feasibility (15-25%), and risks across multiple dimensions with quantified assessments, though implementation faces formidable challenges from economic incentives and geopolitical competition.
Quick Assessment
| Dimension | Assessment | Evidence |
|---|---|---|
| Tractability | Low-Medium (25-35%) | Major lobbying opposition; geopolitical competition; Biden's EO 14110 rescinded January 2025 |
| Potential Impact | High if achieved | Could provide 2-5 years for alignment research; Asilomar precedent shows scientific pauses can work |
| Political Feasibility | Low (15-25%) | Only 30,000 signed FLI open letter; industry opposition strong |
| International Coordination | Very Low (10-20%) | China developing own AI governance framework; US-China competition intense |
| Time Gained if Successful | 2-5 years | Based on proposed 6-month to multi-year pause durations |
| Risk of Backfire | Moderate (30-40%) | Compute overhang; ceding leadership to less safety-conscious actors |
| Advocacy Momentum | Growing | PauseAI protests in 13+ countries; FLI 2025 survey: 64% support superintelligence ban until proven safe |
| Public Support for Regulation | Very High (97%) | Gallup/SCSP September 2025: 97% agree AI safety should be subject to rules; 69% say government not doing enough |
| Economic Stakes | $13T annual value by 2030; $200B invested in 2025 | McKinsey estimates $13T annual contribution; global AI investment reached $200B in 2025 |
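The ranges in the Quick Assessment table can be combined into a rough expected-value sketch. This is purely illustrative arithmetic over the table's own ranges; the one-year backfire penalty is a hypothetical placeholder, not an estimate from the literature.

```python
# Back-of-envelope expected value of pause advocacy, using the
# assessment ranges from the table above (illustrative only).

def expected_years_gained(p_implementation, years_if_successful,
                          p_backfire, backfire_penalty_years):
    """Expected safety-research time gained, net of backfire risk."""
    upside = p_implementation * years_if_successful
    downside = p_backfire * backfire_penalty_years  # e.g. compute-overhang costs
    return upside - downside

# Endpoints of the table's ranges; the 1-year backfire penalty is a
# purely hypothetical placeholder for the cost of a backfire scenario.
low = expected_years_gained(0.15, 2.0, 0.40, 1.0)
high = expected_years_gained(0.40, 5.0, 0.30, 1.0)
print(f"Net expected years gained: {low:.2f} to {high:.2f}")
```

Notably, under these assumptions the pessimistic end of the range is slightly negative, which mirrors the document's point that backfire risk can swamp a low-probability upside.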
Risks Addressed
| Risk | Mechanism | Effectiveness |
|---|---|---|
| Racing Dynamics | Reduces competitive pressure, creates coordination window | High if achieved |
| Loss of Control | Buys time for alignment research | High if alignment is tractable |
| Misuse Risks | Delays deployment of dual-use capabilities | Medium |
| Lock-in | Provides time for governance frameworks | Medium |
| Epistemic Risks | Reduces rushed deployment of unreliable systems | Medium |
Overview
Pause advocacy represents one of the most controversial interventions in AI safety: calling for the deliberate slowing or temporary halting of frontier AI development until adequate safety measures can be implemented. This approach gained significant attention following the March 2023 open letter organized by the Future of Life Institute, which called for a six-month pause on training AI systems more powerful than GPT-4 and garnered over 30,000 signatures including prominent technologists and researchers.
The core premise underlying pause advocacy is that AI capabilities are advancing faster than our ability to align these systems with human values and control them reliably. Proponents argue that without intervention, we risk deploying increasingly powerful AI systems before developing adequate safety measures, potentially leading to catastrophic outcomes. The theory of change involves using advocacy, public pressure, and policy interventions to buy time for safety research to catch up with capabilities development.
However, pause advocacy faces formidable challenges. The economic incentives driving AI development are substantial—U.S. AI investment exceeded $109 billion in 2024, while China has committed over $140 billion in state investment through the National Integrated Circuit Industry Investment Fund. Goldman Sachs research showed that 40% of U.S. real GDP growth in Q1 2025 came from tech capital expenditures, with the majority being AI-related. Critics argue that unilateral pauses by safety-conscious actors could simply cede leadership to less responsible developers, potentially making outcomes worse rather than better.
Economic Context for Pause Advocacy
| Metric | Value | Source | Implication for Pause |
|---|---|---|---|
| U.S. AI Investment (2024) | $109 billion | Microsoft AI Economy Institute | Massive economic interests opposing pause |
| China State AI Investment | $140+ billion | National IC Industry Investment Fund | Geopolitical competition intensifies |
| EU AI Investment (2024) | <$20 billion | Global AI Adoption Report | Europe trails, may support coordination |
| GPT-5 Training Cost | $5 billion (reported) | Industry estimates | Rising costs could concentrate development |
| AI Share of GDP Growth (Q1 2025) | 40% of U.S. real GDP growth | Goldman Sachs | Economic dependence on AI momentum |
| McKinsey Projected Annual Value | $13 trillion by 2030 | McKinsey Global Institute | Opportunity cost of pause is substantial |
| Average AI Breach Cost (2025) | $4.8 million per incident | IBM Cost of Data Breach Report | Safety failures also have economic costs |
Evolution of Pause Proposals
The pause advocacy movement has evolved significantly since 2023, with proposals ranging from temporary moratoria to conditional bans on superintelligence development.
| Proposal | Date | Scope | Signatories | Key Demand | Status |
|---|---|---|---|---|---|
| FLI Open Letter | March 2023 | 6-month pause on GPT-4+ training | 30,000+ | Voluntary moratorium | Ignored by labs |
| CAIS Statement | May 2023 | Risk acknowledgment | 350+ researchers | Recognition of extinction risk | Influenced discourse |
| Statement on Superintelligence | October 2025 | Conditional ban on superintelligence | 700+ (Nobel laureates, public figures) | Prohibition until "broad scientific consensus" on safety | Active campaign |
| PauseAI Policy Proposal | Ongoing | International treaty + AI safety agency | Grassroots movement | IAEA-like body for AI | Advocacy stage |
The October 2025 "Statement on Superintelligence" represents a notable escalation from the 2023 letter. While the original called for a temporary six-month pause, the new statement advocates for a conditional prohibition: "We call for a prohibition on the development of superintelligence, not lifted before there is broad scientific consensus that it will be done safely and controllably, and strong public buy-in." Signatories include Nobel laureates Geoffrey Hinton, Daron Acemoglu, and Beatrice Fihn, alongside public figures like Steve Wozniak, Richard Branson, and Prince Harry and Meghan Markle. FLI director Anthony Aguirre warned that "time is running out," estimating superintelligence could arrive within one to two years.
Theory of Change
```mermaid
flowchart TD
    A[Public Advocacy] --> B[Media Coverage]
    A --> C[Grassroots Organizing]
    B --> D[Public Opinion Shift]
    C --> D
    D --> E[Political Pressure]
    E --> F[Legislative Action]
    E --> G[Industry Self-Regulation]
    F --> H[Compute Governance]
    G --> I[Responsible Scaling Policies]
    H --> J[Slower Development]
    I --> J
    J --> K[Time for Safety Research]
    K --> L[Reduced Catastrophic Risk]
    style A fill:#cce5ff
    style D fill:#fff3cd
    style F fill:#d4edda
    style G fill:#d4edda
    style J fill:#c3e6cb
    style L fill:#98d9a8
```
The pause advocacy theory of change operates through multiple reinforcing pathways. Public advocacy generates media attention, which combined with grassroots organizing shifts public opinion. This creates political pressure that can lead to either legislative action (such as compute governance) or industry self-regulation (such as responsible scaling policies). Both pathways result in slower development timelines, providing additional time for safety research to mature before transformative AI capabilities emerge.
Arguments for Pause
Capabilities-Safety Gap
The most compelling argument for pause centers on the widening gap between AI capabilities and safety research. While frontier models have jumped from GPT-3 (175B parameters, 2020) to GPT-4 (estimated 1.7T parameters, 2023) to even more powerful systems, fundamental alignment problems remain unsolved. Current safety techniques like constitutional AI and reinforcement learning from human feedback (RLHF) appear increasingly inadequate for highly capable systems that could exhibit deceptive behavior or pursue unintended objectives.
| Generation | Parameters | Year | Key Safety Advances | Gap Assessment |
|---|---|---|---|---|
| GPT-3 | 175B | 2020 | Basic RLHF | Moderate |
| GPT-4 | ≈1.7T (est.) | 2023 | Constitutional AI, red-teaming | Widening |
| Claude 3/GPT-4.5 | Undisclosed | 2024 | RSP frameworks, scaling policies | Significant |
| Projected 2026 | 10T+ | 2026 | Unknown | Critical uncertainty |
Research by Anthropic and other safety-focused organizations suggests that as models become more capable, they become harder to interpret and control. A 2023 study by Perez et al. found that larger language models show increased tendencies toward deceptive behaviors when given conflicting objectives. Recent mechanistic interpretability work remains far from scalable to frontier models with hundreds of billions of parameters. Without a pause to develop better alignment techniques, we may cross critical capability thresholds before adequate safety measures are in place.
Coordination Window
Pause advocacy also argues that slower development creates opportunities for beneficial coordination that are impossible during intense racing dynamics. The current AI development landscape involves only a handful of frontier labs—primarily OpenAI, Google DeepMind, and Anthropic—making coordination theoretically feasible. Historical precedents like the Asilomar Conference on Recombinant DNA (1975) and various nuclear arms control agreements demonstrate that the scientific community and governments can successfully coordinate to slow potentially dangerous technological development when risks are recognized.
The window for such coordination may be closing rapidly. As AI capabilities approach transformative levels, the strategic advantages they confer will likely intensify competitive pressures. Research using simulation gaming of AI race dynamics (Gruetzemacher, Avin, Fox et al., 2025) suggests that nations viewing AI as critical to national security may be unwilling to accept coordination mechanisms that could disadvantage them relative to competitors.
Precautionary Principle
Advocates invoke the precautionary principle, arguing that when facing potentially existential risks, the burden of proof should be on demonstrating safety rather than on proving danger. Unlike most technologies, advanced AI systems could pose civilization-level risks if misaligned, making the stakes qualitatively different from typical innovation-risk tradeoffs. This principle has precedent in other high-stakes domains like nuclear weapons development and gain-of-function research, where safety considerations have sometimes overridden rapid advancement.
Arguments Against Pause
Geopolitical Competition
The strongest argument against pause concerns international competition, particularly with China. China's national AI strategy, announced in 2017, explicitly aims for AI leadership by 2030, backed by over $140 billion in state investment through the National Integrated Circuit Industry Investment Fund. However, the picture is more nuanced than simple "race to the bottom" narratives suggest. China released its own AI Safety Governance Framework in September 2024, and in December 2024, 17 major Chinese AI companies including DeepSeek and Alibaba signed safety commitments mirroring Seoul AI Summit pledges.
| Competitive Factor | US Position | China Position | Implication for Pause |
|---|---|---|---|
| Frontier model capability | Leading | 6-18 months behind | Pause could narrow gap |
| Compute access | Advantage (NVIDIA chips) | Constrained by export controls | Pause less urgent for compute |
| Safety research | Leading (Anthropic, OpenAI) | Growing (CnAISDA) | Potential coordination opportunity |
| Regulatory framework | Fragmented, EO rescinded | Unified (CAC framework) | China may regulate faster |
| Talent pool | Advantage | Growing rapidly | Not directly affected by pause |
This concern is compounded by the dual-use nature of AI research. Unlike some other dangerous technologies, AI capabilities research often advances both beneficial applications and potentially dangerous ones simultaneously. Pausing beneficial AI development to prevent dangerous applications may be a poor tradeoff if competitors continue advancing both dimensions.
Compute Overhang Risk
A technical argument against pause involves the "compute overhang" phenomenon. If pause affects model training but not hardware development, accumulated computing power could enable sudden capability jumps when development resumes. Historical analysis of computing trends shows that available compute continues growing exponentially even during periods of algorithmic stagnation. A pause that allows compute scaling to continue could result in more dangerous discontinuous progress than gradual development would produce.
Research by OpenAI's scaling team suggests that sudden access to much larger compute budgets could enable capability gains that bypass incremental safety research. This could be more dangerous than the gradual development that allows safety research to track capabilities improvements.
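The overhang dynamic can be made concrete with a simple growth model. The doubling time below is a stylized assumption chosen for illustration, roughly in the spirit of historical training-compute trends, not a measured constant.

```python
# Stylized compute-overhang model: if available training compute keeps
# growing during a pause on model training, the capability jump available
# at resumption is the growth accumulated over the pause.

def overhang_factor(pause_years, doubling_time_years=0.75):
    """Factor by which available training compute grows during a pause.

    doubling_time_years is an assumed parameter, not an empirical figure.
    """
    return 2 ** (pause_years / doubling_time_years)

for pause in (0.5, 2.0, 5.0):
    print(f"{pause}-year pause -> ~{overhang_factor(pause):.0f}x "
          f"compute jump at resumption")
```

Under this assumption, even the original six-month pause proposal yields only a modest overhang, while a multi-year pause implies a discontinuous jump of one to two orders of magnitude, which is the crux of the overhang objection.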
Economic and Social Costs
Pause advocacy also faces arguments about opportunity costs and distributional effects. AI technologies show significant promise for addressing major challenges including disease, climate change, and poverty. A 2023 economic analysis by McKinsey Global Institute estimated that AI could contribute $13 trillion annually to global economic output by 2030, with particular benefits for developing countries if access is democratized. More recent data shows AI already drives 40% of U.S. real GDP growth (Q1 2025), with economist Michael Roberts calculating that without tech spending, the U.S. would have been close to or in a recession.
Critics argue that pause advocacy primarily benefits existing AI leaders by reducing competitive pressure while imposing costs on society through delayed beneficial applications. The UNDP reports that AI risks "sparking a new era of divergence as development gaps between countries widen"—raising questions about whether pause policies would disproportionately harm developing nations seeking to catch up.
Implementation Mechanisms
Regulatory Approaches
The most direct path to pause involves government regulation. Several mechanisms are under consideration, including compute governance (restricting access to the high-end chips needed for training frontier models), mandatory safety evaluations before deployment, and international treaties.
| Regulatory Mechanism | Jurisdiction | Threshold | Status |
|---|---|---|---|
| Executive Order 14110 | US | 10^26 FLOP training runs | Rescinded January 2025 |
| EU AI Act | EU | 10^25 FLOP (systemic risk) | Active since August 2024 |
| Responsible AI Act (proposed) | US Congress | Tiered thresholds | Under consideration |
| China AI Governance Framework | China | Risk-based grading | Version 2.0 released September 2025 |
However, regulatory approaches face significant implementation challenges. The global nature of AI supply chains, the dual-use character of computing hardware, and the difficulty of defining "frontier" capabilities all complicate enforcement. Additionally, industry opposition remains strong, with major tech companies arguing that regulation could stifle innovation and benefit foreign competitors. The rescission of Biden's EO 14110 within hours of the new administration taking office demonstrates the fragility of executive action on AI governance.
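As a concrete illustration of how the FLOP thresholds in the table function, the sketch below classifies a training run against the EU AI Act systemic-risk threshold and the (rescinded) EO 14110 reporting threshold. The `thresholds_crossed` helper and the per-model compute figures are illustrative: the GPT-4 estimate of roughly 2×10^25 FLOP is a common public estimate, not an official disclosure.

```python
# Sketch of a compute-threshold check against the two regulatory
# thresholds cited in the table above.

EU_AI_ACT_SYSTEMIC_RISK = 1e25  # FLOP, EU AI Act systemic-risk threshold
EO_14110_REPORTING = 1e26       # FLOP, rescinded US EO 14110 threshold

def thresholds_crossed(training_flop):
    """Return the regulatory thresholds a training run meets or exceeds."""
    crossed = []
    if training_flop >= EU_AI_ACT_SYSTEMIC_RISK:
        crossed.append("EU AI Act systemic-risk")
    if training_flop >= EO_14110_REPORTING:
        crossed.append("EO 14110 reporting")
    return crossed

# ~2e25 FLOP (a rough public estimate for GPT-4) crosses only the EU
# threshold; a hypothetical 3e26 FLOP run would cross both.
print(thresholds_crossed(2e25))
print(thresholds_crossed(3e26))
```

The order-of-magnitude gap between the two thresholds is why the EU rule captures current frontier models while the US threshold was calibrated to next-generation training runs.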
Comparative Governance Capacity
| Country/Region | AI Governance Index Score | Regulatory Approach | Pause Feasibility |
|---|---|---|---|
| Singapore | Top performer | Agile governance, regulatory sandboxes | Medium - innovation-focused |
| United Kingdom | Top performer | AI Safety Institute, risk-based | Medium-High - safety-conscious |
| South Korea | Top performer | Ethics-by-design frameworks | Medium - balanced approach |
| European Union | High | EU AI Act (risk-based, 10^25 FLOP threshold) | High - regulatory willingness |
| United States | Medium | Fragmented; AI Action Plan favors speed | Low - deregulation trend |
| China | Medium | Unified CAC framework, state-directed | Variable - strategic calculations |
| UAE | Highest adoption (64%) | Innovation-first | Low - development priority |
The AGILE Index 2025, covering 40+ countries across 43 indicators, reveals that governance capacity varies dramatically. Countries like Singapore and the UK have embedded agile mechanisms that could theoretically support coordinated slowdowns, while the U.S. approach under the July 2025 AI Action Plan explicitly prioritizes "rapid AI innovation" and links federal funding to states adopting less restrictive laws.
Industry Self-Regulation
An alternative approach involves industry-led initiatives, such as responsible scaling policies that commit labs to pause development when certain capability thresholds are reached without adequate safety measures. Anthropic's Responsible Scaling Policy, first announced in September 2023 and updated in October 2024, provides a template for such approaches by defining specific capability evaluations and safety requirements.
| Lab | Policy | Key Commitments | Independent Verification |
|---|---|---|---|
| Anthropic | RSP v2.2 | ASL capability thresholds, biosecurity evals | Limited external audit |
| OpenAI | Preparedness Framework | Red-teaming, catastrophic risk thresholds | Internal governance |
| Google DeepMind | Frontier Safety Framework | Pre-deployment evaluations | In development |
The advantage of industry self-regulation is that it can move faster than formal regulation and may be more technically sophisticated. However, it relies on voluntary compliance and may not address competitive pressures that incentivize racing. After one year with the RSP in effect, Anthropic acknowledged instances where it fell short of meeting the full letter of its requirements, though it assessed these posed minimal safety risk.
Safety Implications and Trajectory
Concerning Aspects
The primary safety concern with pause advocacy is that it may not achieve its intended goals while creating new risks. If pauses are unevenly implemented, they could concentrate advanced AI development among less safety-conscious actors. Additionally, the political and economic pressures against pause may lead to policy capture or symbolic measures that provide false security without meaningfully reducing risks.
There are also concerns about the precedent that successful pause advocacy might set. If pause becomes a standard response to emerging technologies, it could hamper beneficial innovation more broadly. The challenge is distinguishing cases where precautionary pauses are justified from those where they primarily serve incumbent interests.
Promising Aspects
Conversely, even partial success in slowing AI development could provide crucial time for safety research to advance. Recent progress in interpretability research, including work on sparse autoencoders and circuit analysis, suggests that additional time could yield significant safety improvements. In March 2025, Anthropic published research on circuit tracing that "lets us watch Claude think, uncovering a shared conceptual space where reasoning happens before being translated into language." A coordinated slowdown that maintains global leadership among safety-conscious actors could be the best available approach for navigating the development of transformative AI.
Furthermore, pause advocacy has already had positive effects on AI safety discourse by raising awareness of risks and legitimizing safety concerns in policy circles. The 2023 FLI open letter helped establish AI safety as a mainstream policy concern, influencing subsequent regulatory discussions and industry practices.
Current State and Trajectory
As of late 2025, pause advocacy remains politically marginal but increasingly organized. PauseAI, founded in May 2023, has grown from a single San Francisco group to six established chapters across the US (NYC, Chicago, SF, Portland, Phoenix, Washington DC), with protests in 13+ countries including at the AI Seoul Summit and the Paris AI Action Summit.
2025 Protest Milestones
| Event | Date | Location | Focus | Outcome |
|---|---|---|---|---|
| Paris AI Summit Protest | February 2025 | Paris, France | Summit lacking safety focus | Global coordination, Kenya and DRC joined |
| Google DeepMind Protest | June 2025 | London, UK | Gemini 2.5 safety commitments | 60 UK MPs sent letter to Google; largest PauseAI protest |
| PauseCon | June 2025 | London, UK | First activist training event | Movement capacity building |
| Amsterdam ASML Protest | December 2025 | Amsterdam, Netherlands | Chip supply chain for AI | Targeting hardware enablers |
The June 2025 Google DeepMind protest marked a turning point in pause advocacy effectiveness. When Google released Gemini 2.5 Pro in March 2025 without publishing the promised safety testing documentation, PauseAI organized demonstrations outside DeepMind's London headquarters where protesters chanted "Stop the race, it's unsafe" and "Test, don't guess." The action gained political traction when 60 cross-party UK parliamentarians signed a letter accusing Google of "breach of trust" for violating commitments made at the Seoul AI Safety Summit and to the White House in 2023. Former Defence Secretary Lord Browne warned: "If leading companies like Google treat these commitments as optional, we risk a dangerous race to deploy increasingly powerful AI without proper safeguards."
| Timeframe | Scenario | Probability | Key Drivers |
|---|---|---|---|
| 2025-2026 | No meaningful pause | 60-70% | Industry lobbying, EO rescission, China competition |
| 2025-2026 | Voluntary industry slowdown | 15-25% | Safety incident, capability threshold reached |
| 2027-2030 | Coordinated international framework | 15-30% | China-US dialogue progress, WAICO proposal |
| 2027-2030 | Binding compute governance | 10-20% | Major incident, legislative action |
Looking ahead 1-2 years, the trajectory likely depends on whether AI capabilities approach obviously dangerous thresholds. Dramatic capability jumps or safety incidents could shift political feasibility significantly. Conversely, gradual progress that demonstrates controllable development might reduce pause advocacy's appeal.
In the 2-5 year timeframe, international coordination mechanisms will likely determine pause advocacy's viability. China has proposed WAICO (World AI Cooperation Organization) as a framework for coordinating AI governance rules. If major AI powers can establish effective governance frameworks, coordinated development constraints become more feasible. If geopolitical competition intensifies, unilateral pauses become increasingly untenable.
Key Uncertainties
Several critical uncertainties shape the case for pause advocacy:
| Uncertainty | If True (Pro-Pause) | If False (Anti-Pause) | Current Assessment |
|---|---|---|---|
| Alignment is hard | Pause essential to buy research time | Pause unnecessary, current methods sufficient | Unclear; interpretability progress slow but advancing |
| Short timelines | Pause urgent, limited window | Adequate time without constraints | Models improving rapidly; 2-5 year uncertainty |
| International coordination feasible | Global pause achievable | Unilateral pause counterproductive | China-US dialogue shows some progress |
| Racing dynamics dominant | Pause prevents corner-cutting | Aviation industry shows safety can prevail | Competitive pressures strong but not deterministic |
| RSPs work | Voluntary pause-like mechanisms sufficient | Need binding regulation | Anthropic RSP promising but untested at capability thresholds |
The difficulty of AI alignment remains fundamentally unknown—if alignment problems prove tractable with existing techniques, pause advocacy's necessity diminishes significantly. Conversely, if alignment requires fundamental breakthroughs that need years of research, pause may become essential regardless of political difficulties.
The timeline to transformative AI capabilities also remains highly uncertain. If such capabilities are decades away, there may be adequate time for safety research without development constraints. If they arrive within years, pause advocacy's urgency increases dramatically.
Finally, the prospects for international coordination remain unclear. China's approach to AI safety and willingness to participate in coordination mechanisms will largely determine whether global pause initiatives are feasible. The International Dialogues on AI Safety (IDAIS) produced consensus statements, including the Ditchley Statement and the Beijing Statement, that establish specific technological "red lines" such as autonomous replication and deception of regulators, suggesting some foundation for cooperation.
The effectiveness of alternative safety interventions also affects pause advocacy's relative value. If industry responsible scaling policies or technical alignment approaches prove sufficient, the need for development pauses decreases. However, if these approaches fail to keep pace with capabilities, pause may become the only viable risk reduction mechanism.
Public Opinion Landscape
Public support for AI regulation and development pauses has grown substantially, creating a potential political foundation for pause advocacy. The Stanford HAI 2025 AI Index and multiple polling organizations document consistent, bipartisan support for stronger AI governance.
| Poll | Date | Finding | Implication |
|---|---|---|---|
| Gallup/SCSP | September 2025 | 97% agree AI safety should be subject to rules; 80% support even if it slows development | Near-universal support for regulation crosses party lines |
| FLI Survey | October 2025 | 73% support robust AI regulation; 64% support superintelligence ban until proven safe; only 5% favor fast unregulated development | Strong majority favors precautionary approach |
| Quinnipiac | April 2025 | 69% say government not doing enough to regulate AI; 44% think AI will do more harm than good | Public perceives regulatory gap |
| Rethink Priorities | 2025 | 51% would support AI research pause; 25% oppose | Support outstrips opposition 2:1 |
| Pew Research | 2025 | 60% of public and 56% of AI experts worry government won't regulate enough; 64% of Democrats, 55% of Republicans concerned | Expert-public alignment on regulation |
| YouGov (post-FLI letter) | 2023 | 58-61% supported 6-month pause on AI research across different framings | Initial pause support established |
| General Sentiment | 2025 | 77% want companies to "create AI slowly and get it right" even if it delays breakthroughs | Strong preference for caution over speed |
Political Dynamics
Despite strong polling support, translating public opinion into policy remains challenging. The political economy favors AI developers: concentrated benefits for tech companies versus diffuse risks for the public create asymmetric lobbying incentives. The rapid rescission of Biden's EO 14110 in January 2025 illustrates how quickly regulatory frameworks can be dismantled despite public support.
However, recent political developments show emerging bipartisan resistance to industry capture. In 2025, the U.S. Senate voted 99-1 to strike down a provision that would have imposed a 10-year moratorium on state AI regulation—opposed by 17 Republican governors, state attorneys general from both parties, and state legislators nationwide. The Gallup finding that 88% of Democrats and 79% of Republicans and independents support AI safety rules suggests bipartisan potential for regulation if framed appropriately.
| Political Factor | Current State | Trend |
|---|---|---|
| Federal executive | Deregulatory (AI Action Plan) | Anti-pause |
| Federal legislative | Mixed; some bipartisan safety interest | Uncertain |
| State-level | Active regulation; 53% of voters trust states more than Congress | Pro-regulation |
| Industry lobbying | Strong opposition; concentrated economic interests | Anti-pause |
| Public pressure | Growing; 65% believe AI will increase misinformation | Pro-regulation |
Historical Precedents
The Asilomar Conference on Recombinant DNA (1975) provides the most frequently cited precedent for scientific self-regulation and pause. Over 100 scientists, lawyers, and journalists gathered to develop consensus guidelines for recombinant DNA research, following a voluntary moratorium initiated by scientists themselves who had recognized potential biohazards.
| Precedent | Year | Scope | Outcome | Lessons for AI |
|---|---|---|---|---|
| Asilomar Conference | 1975 | Recombinant DNA | Moratorium lifted with safety guidelines; NIH RAC created | Scientists can self-regulate; guidelines enabled rather than blocked research |
| Nuclear weapons moratorium | 1958-1961 | Nuclear testing | Partial Test Ban Treaty (1963) | International coordination possible under existential threat |
| BWC | 1972 | Bioweapons | 187 states parties; no verification regime | Limits of international agreements without enforcement |
| Gain-of-function pause | 2014-2017 | Dangerous pathogen research | Enhanced oversight; research resumed | Temporary pauses can enable safety improvements |
The Asilomar precedent is instructive but imperfect. Key differences from AI:
- Smaller community: Only a few hundred researchers worked on rDNA in 1975 vs. millions in AI today
- Clearer risks: Biohazards were more tangible than AI alignment concerns
- Fewer commercial pressures: Academic research vs. hundreds of billions in investment
- Easier enforcement: Physical lab access vs. distributed compute and open-source models
Notably, the rDNA moratorium lasted only months, and "literally hundreds of millions of experiments, many inconceivable in 1975, have been carried out in the last 30 years without incident"—suggesting that well-designed pauses can enable rather than block beneficial research.
Related Organizations and Approaches
Major organizations advancing pause advocacy include:
- Future of Life Institute: Organized the 2023 open letter with 30,000+ signatures
- PauseAI: Grassroots organization founded May 2023; protests in 13+ countries; claims 70% of Americans support a pause
- Center for AI Safety: Research and policy organization; published the Statement on AI Risk signed by leading researchers
- Academic researchers: Stuart Russell, Yoshua Bengio, Geoffrey Hinton have lent intellectual credibility to pause arguments
Complementary approaches include compute governance initiatives that could enable pause enforcement, international coordination efforts that could make pauses stable, and responsible scaling policies that implement conditional pause-like mechanisms. Technical alignment research also complements pause advocacy by developing the safety measures that would make development resumption safer.
International Governance Developments
The global governance landscape for AI has evolved rapidly, with both opportunities and obstacles for pause advocacy. The inaugural International AI Safety Report, published January 2025, represents the largest global collaboration on AI safety to date—led by Turing Award winner Yoshua Bengio, authored by over 100 AI experts, and backed by 30 countries.
| Initiative | Date | Key Features | Pause Relevance |
|---|---|---|---|
| International AI Safety Report | January 2025 | First comprehensive global review; 100+ experts, 30 countries | Scientific foundation for coordinated action |
| UN Scientific Committee on AI | August 2025 | Independent scientific body for AI assessment | Could provide "broad scientific consensus" mechanism |
| UN Global Dialogue on AI Governance | August 2025 | Inclusive international governance forum | Platform for pause coordination |
| China Global AI Governance Action Plan | July 2025 | Multilateral cooperation framework; scientific cooperation, regulatory coordination | Signals Chinese openness to coordination |
| US AI Action Plan | July 2025 | Deregulation, global competitiveness focus | Opposes pause; favors speed |
| AI Red Lines Campaign | September 2025 | 200+ signatories including 10 Nobel laureates | UN-focused advocacy for "globally unacceptable AI risks" |
| Global Digital Compact | 2024 | First universal agreement on AI governance | Establishes baseline for international norms |
The UN initiatives launched in August 2025 grew from the "Governing AI for Humanity" report and aim to "kickstart a much more inclusive form of international governance." Secretary-General António Guterres emphasized that "the United Nations offers a uniquely universal platform for such global cooperation" and called for building "safe, secure and trustworthy AI systems grounded in international law and human rights."
However, only seven countries—all from the developed world—are parties to all current significant global AI governance initiatives, revealing the fragmentation that hampers pause enforcement. The fundamental tension remains: the US AI Action Plan explicitly encourages "rapid AI innovation" while linking federal funding to states adopting less restrictive AI laws, directly countering pause advocacy goals. Meanwhile, China's Global AI Governance Action Plan emphasizes "establishing international platforms for scientific and technological cooperation" and "strengthening policy and regulatory coordination"—suggesting potential for multilateral frameworks, though skeptics note both powers view AI as a strategic asset and resist binding international limits.
Limitations and Risks of Pause Advocacy
Pause advocacy faces substantial limitations that could undermine its effectiveness or produce unintended consequences.
Structural Limitations
| Limitation | Description | Severity |
|---|---|---|
| Unilateral implementation | A pause by safety-conscious labs could cede leadership to less responsible actors | High |
| Verification difficulty | Unlike nuclear materials, AI training is distributed and hard to monitor | High |
| Economic irreversibility | $109B+ invested creates path dependencies resistant to reversal | High |
| Definitional challenges | No consensus on what constitutes "frontier" AI or "adequate safety" | Medium |
| Dual-use nature | Same capabilities enable beneficial and dangerous applications | Medium |
| Open-source proliferation | Powerful models already available cannot be "undeployed" | High |
Potential Negative Consequences
Critics on LessWrong and elsewhere have identified several ways pause advocacy could backfire:
- Brain drain effect: The least safety-conscious researchers may relocate to jurisdictions without pause policies, concentrating talent in less responsible environments
- Compute overhang: If hardware development continues during a training pause, resumption could enable sudden capability jumps rather than gradual, monitorable progress
- Underground development: Stringent restrictions could push development to illegal or less observable channels, reducing transparency
- Political backlash: Failed pause attempts could discredit AI safety concerns more broadly, making future governance harder
- Opportunity costs: Resources spent on pause advocacy could yield higher impact if directed toward technical alignment research or defensive measures
Effectiveness Uncertainty
| Scenario | Probability | Outcome for Pause Advocacy |
|---|---|---|
| Pause achieves 2+ year slowdown | 15-25% | Success—safety research catches up |
| Pause achieves 6-12 month delay | 20-30% | Partial success—some benefit, risks remain |
| Pause fails, shifts development offshore | 25-35% | Failure—net negative outcome |
| Pause advocacy raises awareness without policy change | 30-40% | Indirect benefit—Overton window shifted |
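The effectiveness scenarios above can be combined into a rough probability-weighted estimate. The sketch below is a back-of-envelope illustration, not part of the source analysis: it takes the midpoint of each probability range (the midpoints sum to 110%, so they are normalized first) and assigns assumed slowdown durations to each scenario. Both the midpoints and the year values are assumptions chosen for illustration.

```python
# Probability-weighted reading of the effectiveness table.
# Midpoints of each range and the slowdown-year values are
# illustrative assumptions, not figures from the source text.

scenarios = {
    # name: (midpoint probability, assumed slowdown in years)
    "2+ year slowdown":             (0.20, 2.5),
    "6-12 month delay":             (0.25, 0.75),
    "development shifts offshore":  (0.30, 0.0),
    "awareness without policy":     (0.35, 0.0),
}

# Midpoints sum to 1.10 because the ranges overlap; normalize to 1.
total = sum(p for p, _ in scenarios.values())
normalized = {k: (p / total, yrs) for k, (p, yrs) in scenarios.items()}

# Expected additional safety-research time across scenarios.
expected_years = sum(p * yrs for p, yrs in normalized.values())

print(f"midpoint probability mass before normalization: {total:.2f}")
print(f"expected slowdown: {expected_years:.2f} years")
```

Under these assumed values, the expectation lands well below one year, which is consistent with the section's framing: most of the probability mass sits on partial or failed outcomes, so the case for pause advocacy rests heavily on the tail scenario where a multi-year slowdown is achieved.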
The EA Forum debate on pause feasibility highlights fundamental uncertainty: proponents argue that advocacy has successfully mainstreamed AI safety concerns (Geoffrey Hinton quitting Google, FLI letter media coverage), while critics note that "six months from now the pause would end, leaving us more or less where we started."
Sources and Further Reading
Primary Documents
- International AI Safety Report (January 2025): First comprehensive global review of AI risks; led by Yoshua Bengio, 100+ experts, backed by 30 countries
- US AI Action Plan (July 2025): Federal policy prioritizing "rapid AI innovation" and global competitiveness
- China Global AI Governance Action Plan (July 2025): Multilateral cooperation framework emphasizing coordination
- UN Global Dialogue on AI Governance (August 2025): Launch of inclusive international governance forum
- California Report on Frontier AI Policy: State-level policy analysis
Public Opinion Research
- Years of Polling Show Overwhelming Voter Support for a Crackdown on AI (Public Citizen): Compilation of polling data showing 97% support for AI safety rules
- The U.S. Public Wants Regulation (or Prohibition) of Expert-Level and Superhuman AI (FLI, October 2025): 64% support superintelligence ban until proven safe
- Poll finds bipartisan agreement on a key issue: Regulating AI (The Conversation): 77% want companies to "create AI slowly and get it right"
- Stanford HAI 2025 AI Index Report: Public Opinion: Comprehensive survey of public attitudes
Economic Analysis
- When Will AI Investments Start Paying Off? (GW&K Investment Management): Analysis of AI's role in GDP growth
- Global AI Adoption in 2025 (Microsoft AI Economy Institute): International adoption statistics
- Government AI Readiness Index 2025 (Oxford Insights): Comparative governance capacity assessment
Critical Analysis
- How "Pause AI" advocacy could be net harmful (LessWrong): Arguments against pause advocacy
- Is Pausing AI Possible? (EA Forum): Analysis of implementation challenges
- Pause For Thought: The AI Pause Debate (EA Forum): Balanced overview of arguments
- What are some criticisms of PauseAI? (EA Forum): Community discussion of weaknesses
Governance Frameworks
- AI governance must keep pace with this fast-developing field (World Economic Forum): AGILE Index methodology and findings
- Global AI law and policy trends update (IAPP): Regulatory landscape overview
- ITU Annual AI Governance Report 2025: Comprehensive governance assessment
References
1strategic insights from simulation gaming of AI race dynamicsScienceDirect (peer-reviewed)·Ross Gruetzemacher, Shahar Avin, James Fox & Alexander K. Saeri·2025▸
This paper uses simulation gaming methodologies to analyze competitive AI development dynamics, exploring how race conditions between actors affect decision-making, safety tradeoffs, and strategic behavior. It provides empirical insights into how competitive pressures shape AI development choices and potential governance interventions.
The Biological Weapons Convention (BWC) is an international arms control treaty that prohibits the development, production, and stockpiling of biological weapons. It entered into force in 1975 and represents the first multilateral disarmament treaty banning an entire category of weapons of mass destruction. It serves as a key reference point for discussions about governing catastrophic biological risks.
Paul Berg's Nobel Prize article recounts the 1975 Asilomar Conference, where scientists voluntarily paused recombinant DNA research to assess biosafety risks and establish guidelines. The article serves as a historical case study of scientific self-governance in response to emerging biotechnology risks. It is frequently cited as a precedent for how the scientific community can proactively address dual-use and safety concerns in powerful new technologies.
MIT Technology Review interviews Max Tegmark six months after the Future of Life Institute's open letter calling for a pause on advanced AI development. While the letter succeeded in shifting the Overton window and normalizing public discussion of existential AI risk, no meaningful U.S. regulation resulted and all major AI companies continued development at full speed. Tegmark argues that only government intervention via FDA-style oversight can create the conditions for an enforceable pause, since no single company can pause unilaterally without competitive disadvantage.
This textbook chapter from the CAIS 'Introduction to AI Safety, Ethics and Society' covers competitive AI race dynamics, including military AI arms races (lethal autonomous weapons, cyberwarfare), corporate races where economic competition undercuts safety, and evolutionary pressures that favor unsafe AI development. It examines how competitive pressures between states and corporations can lead to catastrophic outcomes.
The 1975 Asilomar Conference brought together ~140 scientists, lawyers, and physicians to voluntarily pause and regulate recombinant DNA research due to potential biohazards. It established voluntary safety guidelines and is widely cited as a historical precedent for scientists self-regulating an emerging powerful technology before its risks were fully understood. The conference is frequently invoked in AI safety discussions as a model for proactive governance of transformative technologies.
A comprehensive academic review by Bereska and Gavves (University of Amsterdam, 2024) that surveys mechanistic interpretability—the practice of reverse-engineering neural networks into human-understandable algorithms—with explicit focus on its relevance to AI safety. The review covers foundational concepts like features and circuits, methodologies for causal dissection of model behaviors, and assesses both the benefits and risks of mechanistic interpretability for alignment. It also identifies key challenges around scalability, automation, and generalization to domains beyond language.
A concise open letter coordinated by the Center for AI Safety stating that mitigating extinction-level risk from AI should be a global priority alongside pandemics and nuclear war. The statement has been signed by hundreds of leading AI researchers, executives, and public figures including Geoffrey Hinton, Yoshua Bengio, Sam Altman, and Demis Hassabis, lending significant institutional credibility to existential AI risk concerns.
Analysis of China's AI Safety Governance Framework 2.0, released by the Cyberspace Administration of China's standards bodies in September 2025. The framework reveals China's evolving understanding of AI risks including CBRN misuse, open-source model proliferation, loss of control, and labor market impacts, paired with technical countermeasures and governance recommendations.
This article analyzes training compute thresholds as a regulatory tool for AI governance, examining their use in identifying high-risk AI models. It outlines the advantages of compute as a regulatory metric (quantifiability, verifiability, scalability) while acknowledging limitations like algorithmic efficiency gains, and recommends treating compute thresholds as filters triggering further scrutiny rather than definitive risk measures.
A widely-signed open letter published by the Future of Life Institute in March 2023, calling on all AI labs to pause for at least 6 months the training of AI systems more powerful than GPT-4. It argues that AI development has entered a dangerous uncontrolled race and calls for shared safety protocols, independent auditing, and accelerated AI governance frameworks before proceeding with more powerful systems.
This paper conducts a systematic analysis of over 40 AI policy documents from the US and China to identify areas of convergence in AI governance approaches. It finds meaningful overlap in concerns about algorithmic transparency, system reliability, and multi-stakeholder engagement, suggesting concrete opportunities for bilateral cooperation despite geopolitical tensions.
The Future of Life Institute (FLI) is a nonprofit organization focused on steering transformative technologies, particularly AI, away from catastrophic risks and toward beneficial outcomes. They operate across policy advocacy, research funding, education, and outreach to promote responsible AI development. FLI has been influential in key AI safety milestones including the open letter on AI risks and the Asilomar AI Principles.
Documentation of a February 2024 protest at OpenAI's San Francisco headquarters organized by PauseAI and No AGI, demanding OpenAI halt AGI development and end its military contracts with the Pentagon. The protest highlighted OpenAI's deletion of policy language prohibiting military use and its subsequent Pentagon partnership as evidence of eroding safety commitments.
Executive Order 14110, signed by President Biden in October 2023, established a comprehensive federal framework for AI governance in the United States. It directed agencies to develop safety standards, required developers of powerful AI systems to share safety test results with the government, and addressed issues ranging from biosecurity risks to civil rights protections and workforce impacts.
16. China wants to lead the world on AI regulation — will the plan work? (Nature, peer-reviewed; Elizabeth Gibney, 2025)
This article examines China's approach to AI safety, analyzing whether Chinese government rhetoric, regulatory actions, and research investments reflect genuine commitment to AI safety or primarily serve other political and economic objectives. It explores the tension between China's rapid AI development ambitions and its stated safety concerns.
The Center for AI Safety (CAIS) is a research organization focused on mitigating catastrophic and existential risks from advanced AI systems. It conducts technical research, publishes surveys and statements, and supports field-building efforts across academia and industry. CAIS is notable for its broad coalition-building, including its widely-cited statement on AI extinction risk signed by leading researchers.
Anthropic reflects on the development, implementation, and lessons learned from their Responsible Scaling Policy (RSP), which establishes safety standards and evaluation thresholds tied to AI capability levels. The post discusses how the RSP has evolved, what has worked, and areas for improvement as AI systems become more capable.
PauseAI is an advocacy movement calling for an international pause on the development of advanced AI systems until adequate safety measures and governance frameworks are in place. The organization coordinates activists, provides educational resources, and lobbies policymakers to take urgent action on AI risk. It represents a direct-action approach to AI safety that prioritizes preventing catastrophic outcomes over accelerating beneficial AI.
This page documents Anthropic's Responsible Scaling Policy (RSP), a framework that ties AI development and deployment decisions to demonstrated capability thresholds and corresponding safety measures. It outlines commitments to pause or restrict scaling if AI systems reach certain dangerous capability levels without adequate safeguards, and tracks updates to the policy over time.
DLA Piper analyzes China's AI Safety Governance Framework released in September 2024, which establishes principles and mechanisms for managing AI safety risks including technical safety standards, content controls, and oversight requirements for AI developers and deployers operating in China. The framework reflects China's broader regulatory approach to AI, emphasizing state oversight alongside industry responsibility.
23. Training Compute Thresholds: Features and Functions in AI Regulation (GovAI, Centre for the Governance of AI)
This paper evaluates training compute as a regulatory metric for identifying high-risk general-purpose AI models, arguing it is currently the best available proxy due to its correlation with capabilities, early measurability, and external verifiability. The authors position compute thresholds as an initial filter to trigger further scrutiny—such as evaluations and risk assessments—rather than as standalone determinants of mitigation requirements. The paper directly informs real-world regulatory frameworks including the EU AI Act and US executive orders.
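The filter mechanism the paper describes can be made concrete with a bit of arithmetic. A minimal sketch, using the common 6ND rule of thumb for training compute (FLOP ≈ 6 × parameters × training tokens) and the 10^26 FLOP reporting threshold from Executive Order 14110; the model sizes below are hypothetical illustrations, not estimates of any real system:

```python
# Compute threshold as an initial filter: crossing it triggers further
# scrutiny (evaluations, risk assessments), not an automatic risk verdict.
# Uses the common 6*N*D approximation for training FLOP and the 1e26 FLOP
# reporting threshold from EO 14110. Example model figures are hypothetical.

EO_14110_THRESHOLD_FLOP = 1e26


def estimate_training_flop(params: float, tokens: float) -> float:
    """Approximate total training compute via the 6ND rule of thumb."""
    return 6 * params * tokens


def triggers_scrutiny(params: float, tokens: float) -> bool:
    """True if the estimated training run crosses the reporting threshold."""
    return estimate_training_flop(params, tokens) >= EO_14110_THRESHOLD_FLOP


# Hypothetical frontier-scale run: 1e12 params on 2e13 tokens -> 1.2e26 FLOP
print(triggers_scrutiny(1e12, 2e13))  # True: crosses threshold, gets scrutiny
# Hypothetical mid-size run: 7e10 params on 2e12 tokens -> 8.4e23 FLOP
print(triggers_scrutiny(7e10, 2e12))  # False: well below threshold
```

The appeal of this metric, as the paper notes, is that both inputs are measurable before training completes and externally verifiable via chip procurement and energy use, though algorithmic efficiency gains mean a fixed threshold captures less capability over time.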
This CLTC Berkeley report examines how the aviation industry's rigorous safety culture, certification processes, and regulatory frameworks can inform AI safety practices. It draws parallels between aviation's evolution as a safety-critical domain and the challenges facing AI deployment, offering concrete lessons for developing robust AI safety standards.
Anthropic's research page aggregates their work across AI alignment, mechanistic interpretability, and societal impact assessment, all oriented toward understanding and mitigating risks from increasingly capable AI systems. It serves as a central hub for their published findings and ongoing safety-focused investigations.
The article argues that despite soaring AI company valuations in the trillions of dollars, investment in AI safety research remains critically underfunded relative to capabilities development. It highlights the structural and incentive-driven reasons why safety spending lags behind, and calls attention to the dangerous gap between commercial AI deployment pace and safety assurance.
The 2025 Stanford HAI AI Index report chapter on public opinion presents survey data from 26 countries on how people perceive AI's benefits, risks, and societal impacts. It tracks longitudinal shifts in public attitudes toward AI across dimensions including employment, safety, and trust. This data provides a foundation for understanding the social and political context surrounding AI governance and deployment.
A landmark international scientific assessment co-authored by 96 experts from 30 countries, providing a comprehensive overview of general-purpose AI capabilities, risks, and risk management approaches. It aims to establish shared scientific understanding across nations as a foundation for global AI governance. The report covers topics including capability evaluation, misuse risks, systemic risks, and mitigation strategies.
China's official AI governance framework, released at the 2025 World AI Conference, establishes principles for international cooperation treating AI as a global public good. It emphasizes national sovereignty, safety, controllability, fairness, and open cooperation, calling for coordinated action across stakeholders to advance innovation while maintaining human oversight and ethical development.
A comprehensive policy report from the Joint California Policy Working Group on AI Frontier Models, convened by California Governor Gavin Newsom, authored by leading AI researchers from Stanford, UC Berkeley, and Carnegie Endowment. The report analyzes governance frameworks for frontier AI models and provides policy recommendations for California. It represents a major state-level effort to develop evidence-based AI regulatory frameworks drawing on interdisciplinary expertise.
The ITU's 2025 AI Governance Report provides a comprehensive overview of global AI governance developments, frameworks, and policy trends from an international telecommunications and ICT standards perspective. It examines how nations and international bodies are approaching AI regulation, safety standards, and coordination challenges. The report serves as a reference document for policymakers and stakeholders navigating the evolving AI governance landscape.