AI Governance and Policy
Comprehensive analysis of AI governance mechanisms estimating 30-50% probability of meaningful regulation by 2027 and 5-25% x-risk reduction potential through coordinated international approaches. Documents EU AI Act implementation (€400M enforcement budget), RSP adoption across 60-80% of frontier labs, and current investment of $150-300M/year globally with 500-1,000 dedicated professionals.
Quick Assessment
| Dimension | Assessment | Evidence |
|---|---|---|
| Tractability | Medium-High | 30-50% probability of meaningful regulation in major jurisdictions by 2027; EU AI Act enforcement began August 2025 |
| Investment Level | ≈$150-300M/year globally | Government AI safety institutes, think tanks, advocacy organizations; US AISI requested $12.7M FY2025 |
| Field Size | 500-1,000 FTE | Dedicated governance professionals globally; growing 20-30% annually |
| Political Momentum | High | EU AI Act operational; 12 new national AI strategies published in 2024 (3x 2023); G7 Hiroshima Process |
| Industry Adoption | 60-80% frontier labs | Anthropic, OpenAI, Google DeepMind, Meta have RSPs; 8% of Anthropic staff on security-adjacent work |
| International Coordination | Low-Medium | Bletchley/Seoul summits established; no binding treaties; US-China cooperation minimal |
| Estimated X-Risk Reduction | 5-25% | Conditional on successful international coordination; wide uncertainty range |
| Grade: National Regulation | B+ | EU AI Act most comprehensive framework globally; US AISI faced significant restructuring in 2025 |
| Grade: Industry Standards | B- | RSPs adopted widely but criticized for opacity; SaferAI downgraded Anthropic RSP from 2.2 to 1.9 |
| Grade: International Treaties | C | No binding agreements; BWC has only 4 staff; verification mechanisms absent |
Overview
AI governance encompasses institutions, regulations, and coordination mechanisms designed to shape AI development and deployment for safety and benefit. Unlike technical AI Safety research that solves problems directly, governance creates guardrails, incentives, and coordination mechanisms to reduce catastrophic risk through policy interventions.
This field has rapidly expanded following demonstrations of large language model capabilities and growing concerns about AGI timelines. The Centre for the Governance of AI (GovAI) estimates governance interventions could reduce x-risk by 5-25% if international coordination succeeds, making governance potentially one of the highest-leverage approaches to AI safety.
Recent developments demonstrate increasing political momentum: the EU AI Act entered into force in 2024, the US Executive Order on AI mandated compute reporting thresholds, and industry Responsible Scaling Policies now cover most frontier labs. However, binding international coordination remains elusive.
AI Governance Ecosystem
```mermaid
flowchart TD
subgraph International["International Coordination"]
SUMMITS[AI Safety Summits<br/>Bletchley, Seoul, Paris]
UN[UN AI Advisory Body]
NETWORK[AISI Network<br/>11 countries]
end
subgraph National["National Regulation"]
EU[EU AI Act<br/>Penalties up to 7% revenue]
US_GOV[US NIST CAISI<br/>Renamed June 2025]
UK[UK AI Safety Institute<br/>Pre-deployment testing]
CHINA[China AI Regulations<br/>Algorithmic/Generative rules]
end
subgraph Industry["Industry Standards"]
RSP[Responsible Scaling Policies<br/>Anthropic, OpenAI, DeepMind]
EVALS[Capability Evaluations<br/>Bio, cyber, autonomy]
COMMITS[Voluntary Commitments<br/>Seoul 16-company pledge]
end
subgraph Enforcement["Enforcement Mechanisms"]
COMPUTE[Compute Thresholds<br/>10²⁵ EU, 10²⁶ US]
EXPORT[Export Controls<br/>Chip restrictions]
LIABILITY[Liability Frameworks<br/>EU AI Liability Directive]
end
SUMMITS --> NETWORK
UN --> National
NETWORK --> National
EU --> COMPUTE
US_GOV --> COMPUTE
EU --> LIABILITY
US_GOV --> EXPORT
RSP --> EVALS
National --> Industry
EVALS --> COMMITS
style International fill:#e6f3ff
style National fill:#fff3e6
style Industry fill:#e6ffe6
style Enforcement fill:#ffe6e6
```
Risk/Impact Assessment
| Dimension | Assessment | Quantitative Estimate | Confidence |
|---|---|---|---|
| Tractability | Medium | 30-50% chance of meaningful regulation by 2027 in major jurisdictions | Medium |
| Resource Allocation | Growing rapidly | ≈$100M/year globally on AI governance research and advocacy | High |
| Field Size | Expanding | ≈500-1000 dedicated professionals globally, growing 20-30% annually | Medium |
| Political Will | Increasing | 70%+ of G7 countries have active AI governance initiatives | High |
| Estimated X-Risk Reduction | Substantial if coordinated | 5-25% reduction potential from governance approaches | Low |
| Timeline Sensitivity | Critical | Effectiveness drops sharply if deployed after AGI development | High |
Key Arguments for AI Governance
Coordination Problem Resolution
Even perfect technical solutions for AI alignment may fail without governance mechanisms. The racing dynamics problem requires coordination to prevent a "race to the bottom" where competitive pressures override safety considerations. Toby Ord's analysis in The Precipice suggests international coordination has historically prevented catastrophic outcomes from nuclear weapons and ozone depletion.
Evidence:
- Nuclear Test Ban Treaty reduced atmospheric testing by >95% after 1963
- Montreal Protocol eliminated 99% of ozone-depleting substances
- But the success rate for arms control treaties is only ~40% according to RAND Corporation analysis
Information Asymmetry Correction
AI companies possess superior information about their systems' capabilities and risks. OpenAI's GPT-4 System Card revealed concerning capabilities only discovered during testing, highlighting the need for external oversight and mandatory disclosure requirements.
Key mechanisms:
- Pre-deployment testing requirements
- Third-party evaluation access
- Whistleblower protections
- Capability assessment reporting
Market Failure Addressing
Safety is a public good that markets under-provide due to externalized costs. Dario Amodei's analysis in Anthropic's "Core Views on AI Safety" notes that individual companies cannot capture the full benefits of safety investments, creating systematic under-investment without regulatory intervention.
Major Intervention Areas
1. International Coordination
International coordination aims to prevent destructive competition between nations through treaties, institutions, and shared standards.
Recent Progress:
The Bletchley Declaration (November 2023) achieved consensus among 28 countries on AI risks, followed by the Seoul AI Safety Summit (May 2024), where frontier AI companies made voluntary safety commitments. The Partnership for Global Inclusivity on AI involves 61 countries in governance discussions.
Proposed Institutions:
- International AI Safety Organization (IAISO): Modeled on the IAEA, proposed by Yoshua Bengio and others
- UN AI Advisory Body: Interim report published December 2023; final report, Governing AI for Humanity, September 2024
- Compute Governance Framework: Lennart Heim's research proposes international compute monitoring
Impact of Strong International Coordination
Establishing binding international AI governance could substantially reduce existential risk, though expert estimates vary considerably based on assumptions about verification feasibility, compliance mechanisms, and geopolitical dynamics. The range reflects uncertainty about whether international coordination can overcome the technical challenges of monitoring AI development and the political challenges of sustaining cooperation amid strategic competition.
| Expert/Source | Estimate | Reasoning |
|---|---|---|
| Centre for the Governance of AI | 20-40% x-risk reduction | Drawing on historical precedents from nuclear arms control and biological weapons treaties, this estimate reflects moderate optimism about international coordination's potential. The reasoning emphasizes that successful arms control reduced catastrophic risks during the Cold War despite intense geopolitical tensions, suggesting similar mechanisms could work for AI if verification technologies and enforcement frameworks are developed. However, AI's dual-use nature and faster development timelines pose additional challenges compared to nuclear proliferation. |
| RAND Corporation analysis | 15-30% x-risk reduction | This more conservative estimate accounts for significant verification challenges specific to AI systems, including the difficulty of monitoring software-based capabilities and detecting violations through hardware restrictions alone. The analysis emphasizes that compliance incentives depend heavily on whether leading nations perceive coordination as in their strategic interest, and current US-China tensions suggest this remains uncertain. The estimate factors in that even well-designed treaties may fail if major powers view AI supremacy as critical to national security. |
| FHI technical report | 10-50% x-risk reduction | This exceptionally wide range reflects fundamental uncertainty about whether binding international governance can be implemented effectively at all. The lower bound (10%) represents scenarios where treaties are signed but poorly enforced, creating false confidence while racing dynamics continue. The upper bound (50%) represents optimistic scenarios where strong verification mechanisms, credible enforcement, and sustained great power cooperation combine to substantially slow unsafe AI development. The breadth of this range highlights that governance success depends on resolving multiple independent uncertainties simultaneously. |
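As a rough illustration of how these ranges combine, here is a minimal sketch in Python. The unweighted midpoint average is our simplification for exposition; none of the cited sources endorse this aggregation method.

```python
# Unweighted midpoint average of the three expert ranges above. This
# aggregation is an illustrative simplification, not a cited methodology.
estimates = {
    "GovAI": (0.20, 0.40),
    "RAND": (0.15, 0.30),
    "FHI": (0.10, 0.50),
}

blended = sum((low + high) / 2 for low, high in estimates.values()) / len(estimates)
print(f"Blended midpoint: {blended:.1%} x-risk reduction, conditional on coordination")
# -> 27.5%, above the headline 5-25% figure, which additionally discounts
#    for the probability that binding coordination is achieved at all.
```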
Key Challenges:
- US-China tensions: Trade war and technology competition complicate cooperation
- Verification complexity: Unlike nuclear weapons, AI capabilities are software-based and harder to monitor
- Enforcement mechanisms: International law lacks binding enforcement for emerging technologies
- Technical evolution: Rapid AI progress outpaces slow treaty negotiation processes
Organizations working on this:
- Centre for the Governance of AI (Oxford)
- Center for Security and Emerging Technology (Georgetown)
- Center for a New American Security (CNAS)
- UN Office for Disarmament Affairs (UNODA)
2. National Regulation
National governments are implementing comprehensive regulatory frameworks with legally binding requirements.
United States Framework:
The Executive Order on Safe, Secure, and Trustworthy AI (October 2023) established:
- Compute reporting threshold: Models using >10²⁶ floating-point operations must report to government
- NIST AI Safety Institute: $200M budget for evaluation capabilities
- Pre-deployment testing: Required for dual-use foundation models
Congressional action includes the CREATE AI Act, proposing $2.4B for AI research infrastructure, and various algorithmic accountability bills.
European Union AI Act:
The EU AI Act (entered into force August 2024) creates the world's most comprehensive AI regulation:
| Risk Category | Requirements | Penalties |
|---|---|---|
| Prohibited AI | Ban on social scoring, emotion recognition in schools | Up to €35M or 7% global revenue |
| High-Risk AI | Conformity assessment, risk management, human oversight | Up to €15M or 3% global revenue |
| GPAI Models (>10²⁵ FLOP) | Systemic risk evaluation, incident reporting | Up to €15M or 3% global revenue |
| GPAI Models (>10²⁶ FLOP) | Adversarial testing, model cards, code of conduct | Up to €15M or 3% global revenue |
The implementation timeline extends to 2027, with a €400M budget for enforcement.
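For illustration, the compute-based GPAI tiering in the table above can be expressed as a simple threshold check. This is a hedged sketch: the threshold values come from the table, but the function and cumulative-tier logic are ours, and a real applicability analysis involves far more than a FLOP count.

```python
# Hypothetical encoding of the GPAI tiers tabled above; illustrative only.
def gpai_obligations(training_flop: float) -> list[str]:
    """Map a model's training compute to the cumulative obligations above."""
    obligations = []
    if training_flop > 1e25:
        obligations += ["systemic risk evaluation", "incident reporting"]
    if training_flop > 1e26:
        obligations += ["adversarial testing", "model cards", "code of conduct"]
    return obligations

print(gpai_obligations(3e25))
# -> ['systemic risk evaluation', 'incident reporting']
```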
United Kingdom Approach:
The UK AI Safety Institute focuses on pre-deployment testing and international coordination rather than prescriptive regulation. Key initiatives include:
- Capability evaluations: Testing frontier models before public release
- Safety research: £100M funding for alignment and evaluation research
- International hub: Coordinating with US AISI and other national institutes
Other National Developments:
- China: Draft measures for algorithmic recommendation and generative AI regulation
- Singapore: Model AI Governance Framework for voluntary adoption
- Canada: The proposed Artificial Intelligence and Data Act (Bill C-27) died on the order paper when Parliament was prorogued in early 2025
3. Industry Standards and Self-Regulation
Industry-led initiatives aim to establish safety norms before mandatory regulation, with mixed effectiveness.
Responsible Scaling Policies (RSPs):
Anthropic's RSP pioneered the IF-THEN framework (a minimal code sketch follows the list below):
- IF capabilities reach defined threshold (e.g., autonomous replication ability)
- THEN implement corresponding safeguards (e.g., enhanced containment)
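A minimal sketch of this IF-THEN structure in Python. The capability triggers and safeguards here are illustrative placeholders, not Anthropic's actual ASL definitions.

```python
# Illustrative IF-THEN rules; trigger and safeguard names are placeholders.
RSP_RULES = [
    ("autonomous_replication", "enhanced containment and security controls"),
    ("meaningful_bioweapon_uplift", "deployment pause pending expert review"),
]

def required_safeguards(observed_capabilities: set[str]) -> list[str]:
    """IF a defined capability threshold is reached, THEN the paired
    safeguard must be in place before further scaling or deployment."""
    return [safeguard for trigger, safeguard in RSP_RULES
            if trigger in observed_capabilities]

print(required_safeguards({"autonomous_replication"}))
# -> ['enhanced containment and security controls']
```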
Current adoption:
- Anthropic: ASL-3 now in production (Claude Opus 4 released under ASL-3), with ASL-2 still applied to lower-capability models
- OpenAI: Preparedness Framework with risk assessment scorecards
- Google DeepMind: Frontier Safety Framework for responsible deployment
- Meta: System-level safety approach (Purple Llama) focusing on red-teaming
Effectiveness Assessment:
- Strengths: Rapid implementation, industry buy-in, technical specificity
- Weaknesses: Voluntary nature, competitive pressure, limited external oversight
Voluntary Safety Commitments:
Post-Seoul Summit commitments from 16 leading AI companies include:
- Publishing safety frameworks publicly
- Sharing safety research with governments
- Enabling third-party evaluation access
Safety-washing concerns highlight the risk of superficial compliance without substantive safety improvements.
Can industry self-regulation be sufficient for catastrophic risk?
Views on whether voluntary commitments alone can prevent AI catastrophe diverge widely; the voluntary nature and competitive pressures noted above are the main reasons for skepticism.
4. Compute Governance
Compute governance leverages the concentrated, trackable nature of AI training infrastructure to implement upstream controls.
Current Mechanisms:
Export Controls: The October 2022 semiconductor restrictions limited China's access to advanced AI chips:
- NVIDIA A100/H100 exports restricted to China
- Updated controls (October 2023) closed loopholes
- Estimated to delay Chinese frontier AI development by 1-3 years, according to CSET analysis
Compute Thresholds:
- EU AI Act: 10²⁵ FLOP threshold for enhanced obligations
- US Executive Order: 10²⁶ FLOP reporting requirement
- UK consideration: Similar thresholds for pre-deployment testing
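These thresholds can be checked with back-of-envelope arithmetic: total training compute is roughly chips × per-chip throughput × utilization × wall-clock time. A sketch follows, with illustrative hardware figures that are our assumptions rather than any specific lab's configuration.

```python
# Back-of-envelope threshold check; hardware figures are illustrative.
EU_THRESHOLD_FLOP = 1e25   # EU AI Act enhanced obligations
US_THRESHOLD_FLOP = 1e26   # US Executive Order reporting requirement

def training_flop(num_chips: int, flop_per_sec: float,
                  utilization: float, days: float) -> float:
    """Total compute = chips x per-chip FLOP/s x utilization x seconds."""
    return num_chips * flop_per_sec * utilization * days * 86_400

# Example: 10,000 accelerators at ~1e15 FLOP/s each, 40% utilization, 90 days.
total = training_flop(10_000, 1e15, 0.4, 90)
print(f"{total:.2e} FLOP; crosses EU threshold: {total > EU_THRESHOLD_FLOP}, "
      f"US threshold: {total > US_THRESHOLD_FLOP}")
# ~3.1e25 FLOP: above the EU's 10^25 line but below the US's 10^26 line.
```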
Proposed Mechanisms:
- Hardware registration: Mandatory tracking of high-performance AI chips
- Cloud compute monitoring: Know-your-customer requirements for large training runs
- International verification: IAEA-style monitoring of frontier AI development
Limitations:
- Algorithmic efficiency gains: Reducing compute requirements for equivalent capabilities
- Distributed training: Splitting computation across many smaller systems
- Semiconductor evolution: New architectures may circumvent current controls
5. Liability and Legal Frameworks
Legal liability mechanisms aim to internalize AI risks and create accountability through courts and regulatory enforcement.
Emerging Frameworks:
Algorithmic Accountability:
- The proposed EU AI Liability Directive creates presumptions of causality
- US state-level algorithmic auditing requirements (e.g., NYC Local Law 144)
Product Liability Extension:
- Treating AI systems as products subject to strict liability
- California SB 1047 proposed developer liability for catastrophic AI harms (vetoed September 2024)
- Challenge: Establishing causation chains in complex AI systems
Whistleblower Protections:
- EU AI Act Article 85 protects AI whistleblowers
- Proposed US federal legislation for AI safety disclosures
- Industry resistance due to competitive sensitivity concerns
Current State & Trajectory
Regulatory Implementation Timeline
| Jurisdiction | Current Status | 2025 Milestones | 2027 Outlook |
|---|---|---|---|
| EU | AI Act in force, implementation beginning | High-risk AI requirements active | Full enforcement with penalties |
| US | Executive Order implementation ongoing | Potential federal AI legislation | Comprehensive regulatory framework |
| UK | AISI operational, light-touch approach | Pre-deployment testing routine | Possible binding requirements |
| China | Sectoral regulations expanding | Generative AI rules mature | Comprehensive AI law likely |
Industry Compliance Readiness
Anthropic's compliance analysis estimates:
- Large labs: 70-80% ready for EU AI Act compliance by 2025
- Smaller developers: 40-50% ready, may exit EU market
- Open-source community: Unclear compliance pathway for foundation models
International Coordination Progress
Achieved:
- Regular AI Safety Summit process established
- Voluntary industry commitments from major labs
- Technical cooperation between national AI Safety Institutes
Pending:
- Binding international agreements on AI development restrictions
- Verification and enforcement mechanisms
- China-US cooperation beyond technical exchanges
Key Uncertainties and Cruxes
Technical Feasibility Cruxes
Key Questions
Can AI capabilities be reliably measured and verified for governance purposes?
- Yes, evaluation methods are improving rapidly: NIST AISI is developing standardized benchmarks, private labs are sharing evaluation methods, and compute thresholds provide objective metrics. Implication: governance mechanisms can rely on capability thresholds and testing requirements. (Confidence: medium)
- No, capabilities are too complex and gaming-prone: Goodhart's law applies to benchmarks, emergent capabilities are unpredictable, and gaming incentives undermine measurement validity. Implication: governance must rely on process requirements rather than capability metrics. (Confidence: medium)
Will export controls remain effective as semiconductor technology evolves?
- Yes, chokepoints will persist: advanced chip manufacturing requires specialized equipment and materials, and TSMC/Samsung dependencies create controllable bottlenecks. Implication: continue strengthening export control regimes and allied coordination. (Confidence: medium)
- No, technological diffusion will undermine controls: China is investing heavily in domestic capabilities, algorithmic efficiency is reducing compute requirements, and new architectures may bypass restrictions. Implication: shift focus to other governance mechanisms such as international agreements. (Confidence: low)
Geopolitical Coordination Cruxes
The central uncertainty is whether US-China cooperation on AI governance is achievable. Graham Allison's "Thucydides Trap" analysis suggests structural forces make cooperation difficult, while Joseph Nye argues shared existential risks create cooperation incentives.
Evidence for cooperation possibility:
- Both countries face AI Risk from uncontrolled development
- Nuclear arms control precedent during Cold War tensions
- Track 1.5 dialogue continuing through official channels
Evidence against cooperation:
- AI viewed as strategic military technology
- Current trade war and technology restrictions
- Domestic political pressure against appearing weak
Timing and Sequence Cruxes
The relationship between governance timeline and AGI development critically affects intervention effectiveness:
If AGI arrives before governance maturity (3-7 years):
- Focus on emergency measures: compute caps, development moratoria
- International coordination becomes crisis management
- Higher risk of poorly designed but rapidly implemented policies
If governance has time to develop (7+ years):
- Opportunity for evidence-based, iterative policy development
- International institutions can mature gradually
- Lower risk of governance mistakes harming beneficial AI development
Key Organizations and Career Paths
Leading Research Organizations
Academic Institutes:
- Centre for the Governance of AI (Oxford): ~25 researchers, leading governance research
- Center for Security and Emerging Technology (Georgetown): ~40 staff, China expertise and technical analysis
- Stanford Human-Centered AI Institute (HAI): Policy research and government engagement
- Belfer Center (Harvard Kennedy School): Technology and national security focus
Think Tanks:
- Center for a New American Security (CNAS): Defense and technology policy
- Brookings Institution: AI governance and regulation analysis
- RAND Corporation: Policy analysis and government consulting
- Center for Strategic and International Studies (CSIS): Technology competition and governance
Government Bodies
National AI Safety Institutes:
- US NIST AI Safety Institute: ~100 planned staff, $200M budget
- UK AI Safety Institute: ~50 staff, pre-deployment testing focus
- EU AI Office: AI Act implementation and enforcement
Advisory Bodies:
- US AI Safety and Security Board: Private-public coordination
- UK AI Council: Industry and academic advice
- EU High-Level Expert Group on AI: Ethics and governance guidance
Career Pathways
Entry Level (0-3 years experience):
- Research Assistant at governance organization ($50-70K)
- Government fellowship programs (TechCongress, AAAS Science & Technology Policy Fellowships) ($80-120K)
- Policy school (MPP/MPA) with AI focus ($80-150K debt typical)
Mid-Level (3-8 years experience):
- Policy researcher at think tank ($80-120K)
- Government policy analyst (GS-13/14, $90-140K)
- Advocacy organization program manager ($90-150K)
Senior Level (8+ years experience):
- Government senior advisor/policy director ($150-200K)
- Think tank research director ($180-250K)
- International organization leadership ($200-300K)
Useful Backgrounds:
- Law (especially administrative, international, technology law)
- Political science/international relations
- Economics (mechanism design, industrial organization)
- Technical background with policy interest
- National security/foreign policy experience
Complementary Interventions
AI governance works most effectively when combined with:
- Technical AI Safety Research: Provides feasible safety requirements for regulation
- AI Safety Evaluations: Enables objective capability and safety assessment
- AI Safety Field Building: Develops governance expertise pipeline
- Corporate AI Safety: Ensures private sector implementation of public requirements
- Public AI Education: Builds political support for governance interventions
Risks and Limitations
Governance Failure Modes
Premature Lock-in:
- Poorly designed early regulations could entrench suboptimal approaches
- Example: EU's GDPR complexity potentially serving as template for AI regulation
- Mitigation: Sunset clauses, regular review requirements, adaptive implementation
Regulatory Capture:
- Incumbent AI companies could shape rules to favor their positions
- OpenAI's advocacy for licensing potentially creates barriers to competitors
- Mitigation: Multi-stakeholder input, transparency requirements, conflict-of-interest rules
Innovation Suppression:
- Overly restrictive regulations could slow beneficial AI development
- Open-source AI development particularly vulnerable to compliance costs
- Mitigation: Risk-based approaches, safe harbors for research, impact assessments
Authoritarian Empowerment:
- AI governance infrastructure could facilitate surveillance and control
- China's social credit system demonstrates risks of AI-enabled authoritarianism
- Mitigation: Democratic oversight, civil liberties protections, international monitoring
International Coordination Challenges
Free Rider Problem:
- Countries may benefit from others' safety investments while avoiding costs
- Similar to climate change cooperation difficulties
- Potential solution: Trade linkages, conditional cooperation mechanisms
Verification Difficulties:
- Unlike nuclear weapons, AI capabilities are primarily software-based
- Detection of violations requires access to proprietary code and training processes
- Possible approaches: Hardware monitoring, whistleblower incentives, technical cooperation agreements
Transparency for Intelligence Explosion Detection
Ajeya Cotra has proposed that frontier AI labs adopt a transparency regime specifically designed to detect the onset of an intelligence explosion, the point at which AI begins significantly accelerating AI research. Proposed reporting requirements include:
| Metric | Frequency | Rationale |
|---|---|---|
| Highest benchmark scores | Quarterly (calendar-based, not release-based) | Dangerous capability jumps may occur internally before any product launch |
| AI adoption in code production | Quarterly | Fraction of pull requests mostly AI-written and AI-reviewed tracks real decision-making authority |
| Safety incident disclosures | Ongoing | Whether models have lied about important matters or covered up logs in real internal use |
| Observed internal productivity | Quarterly | Whether labs are discovering insights faster, the ultimate signal of intelligence explosion onset |
Cotra argues this information should be public rather than shared only with governments, because detecting and responding to an intelligence explosion requires society-wide common knowledge. Whistleblower protections are a critical enabler: several employees at frontier labs have privately expressed concerns about safety incidents but face legal and career risks from disclosure. The RAISE Act (proposed legislation) and California's SB 53 both include whistleblower protection provisions that Cotra considers among the most important elements of AI safety legislation.
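One way to operationalize these requirements would be a machine-readable quarterly disclosure format. The schema below is purely illustrative; the `QuarterlyTransparencyReport` class and its field names are invented for this sketch, not drawn from Cotra's proposal:

```python
from dataclasses import dataclass, field

@dataclass
class QuarterlyTransparencyReport:
    """Illustrative schema for the calendar-based disclosures proposed above."""
    lab: str
    quarter: str                          # e.g. "2025-Q3"; calendar-based, not release-based
    best_internal_benchmark_scores: dict  # benchmark name -> highest score, incl. unreleased models
    ai_written_pr_fraction: float         # share of pull requests mostly AI-written
    ai_reviewed_pr_fraction: float        # share of pull requests mostly AI-reviewed
    safety_incidents: list = field(default_factory=list)  # e.g. deception, log tampering
    internal_productivity_notes: str = ""  # qualitative signal of research acceleration

# Hypothetical example of one quarter's public filing:
report = QuarterlyTransparencyReport(
    lab="ExampleLab",
    quarter="2025-Q3",
    best_internal_benchmark_scores={"SWE-bench": 0.81},
    ai_written_pr_fraction=0.45,
    ai_reviewed_pr_fraction=0.30,
    safety_incidents=["model denied editing a config file its logs show it edited"],
)
```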
Critical Assessment and Evidence Base
Track Record Analysis
Historical precedents for technology governance:
- Nuclear Non-Proliferation Treaty: 191 parties, yet nine states possess nuclear weapons
- Chemical Weapons Convention: 193 parties, largely effective enforcement
- Biological Weapons Convention: 183 parties, but verification challenges remain
- Montreal Protocol: 198 parties, successful phase-out of ozone-depleting substances
Success factors from past agreements:
- Clear verification mechanisms
- Economic incentives for compliance
- Graduated response to violations
- Technical assistance for implementation
AI governance unique challenges:
- Dual-use nature of AI technology
- Rapid pace of technological change
- Diffuse development across many actors
- Difficulty of capability verification
Current Effectiveness Evidence
| Intervention | Measurable Outcomes | Assessment |
|---|---|---|
| EU AI Act implementation | 400+ companies beginning compliance programs | Early stage, full impact unclear |
| US compute reporting thresholds | 6 companies reported to NIST as of late 2024 | Good initial compliance (threshold sketch below) |
| Export controls on China | ≈70% reduction in advanced chip exports to China | Effective short-term, adaptation ongoing |
| Voluntary industry commitments | 16 major labs adopted safety frameworks | High participation, implementation quality varies |
| AI Safety Institute evaluations | ≈10 frontier models evaluated pre-deployment | Establishing precedent for external review |
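For context on how the compute reporting row above works in practice: the October 2023 Executive Order set its reporting trigger at 10^26 operations for a single training run, and training compute for dense transformers is commonly approximated as 6 × parameters × training tokens. The sketch below applies that approximation to a hypothetical run (the parameter and token counts are invented, not any company's actual filing):

```python
# Rough training-compute estimate against the Executive Order's
# 1e26-operation reporting threshold. C ~= 6 * N * D is the standard
# dense-transformer approximation (forward plus backward pass).

REPORTING_THRESHOLD = 1e26  # operations, per the October 2023 Executive Order

def training_ops(params: float, tokens: float) -> float:
    return 6 * params * tokens

# Hypothetical frontier run: 1 trillion parameters, 20 trillion tokens.
run = training_ops(params=1e12, tokens=20e12)
print(f"Estimated compute: {run:.1e} ops")  # -> 1.2e+26
print("Reportable" if run > REPORTING_THRESHOLD else "Below threshold")
```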
Resource Requirements and Cost-Effectiveness
Global governance investment estimate: $200-500M annually across all organizations and governments
Potential impact if successful:
- 5-25% reduction in existential risk from AI
- Billions in prevented accident costs
- Improved international stability and cooperation
Cost per unit risk reduction:
- Roughly $10-100M per percentage point of x-risk reduction
- Compares favorably to other longtermist interventions
- But high uncertainty in both costs and effectiveness (see the back-of-envelope check below)
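A quick back-of-envelope check of that range, using only the document's own figures and the simplifying assumption that one year's global spending buys the full reduction:

```python
# Back-of-envelope check of the stated cost-effectiveness range.
# All inputs are the document's own estimates.

low_spend, high_spend = 200e6, 500e6   # $/year, global governance investment
low_cut, high_cut = 5, 25              # percentage points of x-risk reduction

best_case = low_spend / high_cut       # cheap spend, large effect
worst_case = high_spend / low_cut      # expensive spend, small effect

print(f"Cost per point: ${best_case/1e6:.0f}M to ${worst_case/1e6:.0f}M")
# -> $8M to $100M per percentage point, consistent with the $10-100M figure
```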
Getting Started in AI Governance
Immediate Actions
For Policy Students/Early Career:
- Apply to the AI Safety Fundamentals Governance Track (BlueDot Impact)
- Read core papers from the Centre for the Governance of AI (GovAI)
- Follow policy developments via the Import AI newsletter and the AI Policy & Governance newsletter
- Apply for fellowships: TechCongress, CSET (Georgetown)
For Experienced Professionals:
- Transition via the AI Policy Entrepreneurship program
- Engage with Partnership on AI working groups
- Contribute expertise to NIST AI Risk Management Framework development
- Join professional networks: AI Policy Network, governance researcher communities
Skills Development Priorities
High-priority skills:
- Policy analysis and development
- International relations and diplomacy
- Technical understanding of AI capabilities
- Stakeholder engagement and coalition building
- Regulatory design and implementation
Medium-priority skills:
- Economics of technology regulation
- Legal framework analysis
- Public communication and advocacy
- Cross-cultural competency (especially US-China relations)
References
CSIS is a leading bipartisan policy research organization focused on defense, security, and geopolitical issues. It produces analysis on technology policy, AI governance, cybersecurity, and international competition relevant to AI safety and emerging technology governance. Its work informs U.S. government and allied nation decision-making on critical technology issues.
This RAND Corporation report examines systemic risks posed by advanced AI systems, analyzing how failures or misuse could cascade across interconnected critical systems. It provides a structured framework for understanding risk pathways and governance interventions at national and international levels. The report aims to inform policymakers on proactive risk mitigation strategies.
This interim report from the UN Secretary-General's AI Advisory Body examines the governance challenges posed by advanced AI systems and proposes frameworks for international cooperation. It analyzes risks and opportunities of AI at the global level, with particular focus on ensuring AI development benefits all nations including the Global South. The report lays groundwork for recommendations on international AI governance architecture.
The AI Policy Network is a collaborative platform focused on AI governance and policy coordination across international and institutional boundaries, serving as a hub for researchers, policymakers, and stakeholders working on AI regulation and compute governance frameworks.
RAND Corporation is a nonprofit research organization providing objective analysis and policy recommendations across a wide range of topics including national security, technology, governance, and emerging risks. It produces influential studies on AI policy, cybersecurity, and global governance challenges. RAND's work is frequently cited by governments and policymakers worldwide.
Faculty profile page for Joseph Nye, political scientist and former US government official known for concepts like 'soft power' and work on international relations, governance, and technology policy. His work increasingly addresses AI governance and cyber security in the context of great power competition. Relevant to AI safety discussions around international coordination and governance frameworks.
Partnership on AI (PAI) is a nonprofit coalition of AI researchers, civil society organizations, academics, and companies working to develop best practices, conduct research, and shape policy around responsible AI development. It brings together diverse stakeholders to address challenges including safety, fairness, transparency, and the societal impacts of AI systems. PAI serves as a coordination hub for cross-sector dialogue on AI governance.
The EU AI Act is the world's first comprehensive legal framework for artificial intelligence, establishing a risk-based classification system for AI applications. It imposes varying obligations on developers and deployers depending on the risk level of their AI systems, from minimal-risk to unacceptable-risk categories. The act sets precedents for global AI governance and compliance requirements.
The AI Seoul Summit 2024, co-hosted by the UK and Republic of Korea in May 2024, advanced global AI safety governance by securing international agreements on risk assessment frameworks, launching the first international network of AI Safety Institutes, and obtaining safety commitments from 16 major AI companies worldwide. It built on the Bletchley Park AI Safety Summit of November 2023 as part of an ongoing international diplomatic process.
Anthropic's foundational public statement on AI safety, arguing that transformative AI may arrive within a decade due to scaling laws, that current training methods cannot reliably ensure safe behavior, and that urgent multi-faceted safety research—including mechanistic interpretability, scaling supervision, and process-oriented learning—is essential to prevent catastrophic outcomes.
Anthropic's article on Constitutional AI (the original URL now returns a 404) explained Anthropic's approach to training AI systems to be helpful, harmless, and honest using a set of guiding principles.
The Bureau of Industry and Security (BIS) strengthened U.S. export controls in October 2023, with significant new restrictions on advanced semiconductor exports, particularly targeting China. The update includes expanded controls on chips used for AI training and advanced computing, closing loopholes from the October 2022 rules and extending restrictions to additional countries.
Meta announces Purple Llama, an umbrella project releasing open-source trust and safety tools for generative AI developers. The initial release includes CyberSec Eval (cybersecurity safety benchmarks for LLMs) and Llama Guard (an input/output safety classifier), aiming to democratize access to safety infrastructure for responsible AI deployment.
OpenAI's safety hub outlines their multi-stage approach to AI safety through teaching (value alignment and content filtering), testing (red teaming and preparedness evaluations), and sharing (real-world feedback loops). It covers key concern areas including child safety, deepfakes, bias, and election integrity, and links to their Preparedness Framework and related safety documentation.
A CSET analysis examining the global semiconductor supply chain, its geographic concentrations, dependencies, and implications for national security and technology competition. The analysis maps key chokepoints and vulnerabilities relevant to AI compute governance and export controls.
The Bletchley Declaration is a landmark multilateral agreement signed by 28 countries at the UK's AI Safety Summit in November 2023, establishing shared recognition of AI's risks and opportunities. It represents the first major international consensus document specifically focused on frontier AI safety, committing signatories to cooperative risk assessment and governance frameworks.
This is the careers and job listings page for the Center for Security and Emerging Technology (CSET) at Georgetown University. It lists open positions, student opportunities, and provides resources like FAQs and hiring info session recordings for prospective research analysts, data research analysts, and research fellows.
The U.S. Department of Homeland Security's AI Safety and Security Board is a federal initiative to address AI-related safety and security risks to critical infrastructure and national security. It brings together government, industry, and civil society stakeholders to develop guidance and best practices for safe AI deployment. The board represents a major U.S. government effort to operationalize AI governance at the national security level.
The NIST AI RMF is a voluntary, consensus-driven framework released in January 2023 to help organizations identify, assess, and manage risks associated with AI systems while promoting trustworthiness across design, development, deployment, and evaluation. It provides structured guidance organized around core functions and is accompanied by a Playbook, Roadmap, and a Generative AI Profile (2024) addressing risks specific to generative AI systems.
The Centre for the Governance of AI (GovAI) research hub aggregates policy-relevant technical and governance research on frontier AI systems, covering topics from biosecurity and cybercrime to labor market impacts and AI auditing. It serves as a comprehensive repository of GovAI's publications spanning multiple years and research themes. The page indexes papers addressing near-term and long-term risks from advanced AI systems.
CNAS is a Washington D.C.-based national security think tank publishing research on defense, technology policy, economic security, and AI governance. Its Technology & National Security program produces policy-relevant work on AI, cybersecurity, and emerging technologies with implications for AI safety and governance.
UK government page hosting the formal safety commitments made by AI companies at the AI Seoul Summit 2024, a follow-up to the Bletchley Park AI Safety Summit. The page has since been moved or removed.
Bill C-27 contained Canada's proposed Artificial Intelligence and Data Act (AIDA), which would have established a regulatory framework for high-impact AI systems in Canada. AIDA was part of a broader digital charter implementation act alongside privacy law reforms; the bill died on the order paper when Parliament was prorogued in January 2025.
UNODA is the United Nations body responsible for promoting multilateral disarmament and non-proliferation across conventional weapons, weapons of mass destruction, and emerging technologies. It facilitates international treaties, norms, and dialogue on the governance of potentially destabilizing technologies, including autonomous weapons systems and AI in military contexts. It serves as a key international coordination point for efforts to prevent catastrophic risks from advanced weaponry.
The Precipice: Existential Risk and the Future of Humanity – Toby Ord (Future of Humanity Institute)
This page presents Toby Ord's book 'The Precipice,' which argues humanity currently faces unprecedented existential risks, including from advanced AI, and makes a moral case for prioritizing their reduction. Ord provides probability estimates for various catastrophic and existential risks and argues this century is uniquely critical for humanity's long-term future.
OpenAI's blog post argues that superintelligence may arrive sooner than expected and calls for new governance frameworks, including international coordination and licensing regimes for the most powerful AI systems. It outlines OpenAI's views on how society should prepare for and oversee AI systems that could surpass human-level capabilities across most domains.
A newsletter focused on AI policy and governance developments, covering regulatory updates, international coordination efforts, and compute governance issues. It aggregates and analyzes ongoing policy discussions relevant to AI safety and oversight.
The Brookings Institution maintains an AI governance tracker that monitors policy developments, regulatory proposals, and legislative actions related to artificial intelligence across jurisdictions. It serves as a reference resource for tracking the evolving landscape of AI governance initiatives globally.
This U.S. Bureau of Industry and Security (BIS) press release announces sweeping export control rules targeting advanced computing chips and semiconductor manufacturing equipment, aimed at preventing China from acquiring or producing advanced semiconductors used in AI and military applications. The rules restrict exports of high-performance chips (including A100/H100-class GPUs), chip manufacturing tools, and impose restrictions on U.S. persons supporting Chinese chipmaking. This represents a major inflection point in compute governance and AI geopolitics.
This Open Philanthropy grant page documents funding provided to the Center for AI Policy Entrepreneurship, supporting efforts to develop and promote effective AI governance policies. The grant reflects Open Philanthropy's investment in building the policy capacity needed to address risks from advanced AI systems.
Graham Allison applies the 'Thucydides's Trap' framework to US-China relations, arguing that when a rising power threatens an established hegemon, war is a likely outcome. Drawing on historical case studies, he examines whether the US and China can avoid great-power conflict through strategic statecraft. The analysis has significant implications for understanding geopolitical risks surrounding AI competition and technology governance.
President Biden's landmark October 2023 Executive Order establishes comprehensive federal policy on AI safety, directing agencies to develop standards, testing requirements, and oversight mechanisms for advanced AI systems. It mandates safety evaluations for frontier AI models, addresses risks to national security and critical infrastructure, and promotes international coordination on AI governance. The order leverages the Defense Production Act to require developers of powerful AI systems to share safety test results with the federal government.
NYC Local Law 144: Automated Employment Decision Tools Regulation (legistar.council.nyc.gov)
NYC Local Law 144 requires employers to conduct independent bias audits of automated employment decision tools before deployment and to notify affected job candidates and employees. The law mandates transparency about what characteristics these AI systems evaluate and imposes civil penalties for violations, making it one of the first local laws in the US to directly regulate algorithmic hiring tools.
The EU AI Act is the world's first comprehensive legal framework regulating artificial intelligence, establishing a risk-based classification system for AI systems with obligations scaled to potential harm. It bans certain AI applications outright, imposes strict requirements on high-risk systems, and creates transparency obligations for general-purpose AI models including those with systemic risk. The regulation applies to providers, deployers, and importers operating in the EU market.
China's Cyberspace Administration published draft regulations for public consultation in April 2023 establishing comprehensive requirements for generative AI service providers in China. The draft covers content safety aligned with socialist values, data governance, user protection, algorithmic accountability, and security assessments. It represents one of the world's first major national regulatory frameworks specifically targeting generative AI.
The EU High-Level Expert Group on AI published Ethics Guidelines for Trustworthy AI, establishing a framework for AI systems that are lawful, ethical, and robust. The guidelines introduce seven key requirements for trustworthy AI including human agency, privacy, transparency, and accountability. This document became foundational to the EU's broader AI regulatory agenda, influencing the EU AI Act.
This consensus paper by Yoshua Bengio and colleagues argues that advancing AI systems pose extreme risks—including large-scale social harms, malicious misuse, and irreversible loss of human control—that current safety research and governance mechanisms are inadequate to address. The authors propose a comprehensive response combining technical AI safety research with proactive, adaptive governance frameworks, drawing on lessons from other safety-critical technologies.
A structured educational curriculum offered by BlueDot Impact covering AI governance fundamentals, designed to help participants understand the policy, regulatory, and institutional landscape around AI safety. The course covers topics such as AI risks, compute governance, international coordination, and regulatory approaches to ensure safe AI development.
The AAAS Science & Technology Policy Fellowships place scientists and engineers in federal government positions to contribute technical expertise to U.S. policy development. Fellows are embedded in Congress, executive branch agencies, and international bodies to bridge the gap between scientific knowledge and policy decisions. This program is a pathway for technically trained individuals to influence AI governance and technology regulation from within government.
Stanford's Human-Centered Artificial Intelligence (HAI) institute explores the intersection of AI companions and mental health, examining benefits, risks, and governance considerations of AI-powered emotional support tools. The resource reflects HAI's broader mission of responsible AI development that centers human well-being.
The European Commission's proposed AI Liability Directive (2022) establishes rules for civil liability claims related to AI system harms, introducing a rebuttable presumption of causality to ease the burden of proof for victims. It complements the EU AI Act by addressing how existing liability frameworks apply to AI-specific harms. The directive aims to ensure that victims of AI-caused damage have equivalent legal protection to victims of non-AI harms.
California SB 1001 is a state law requiring that automated accounts (bots) disclose their non-human nature when communicating with users online, particularly in commercial or political contexts. The law aims to prevent deceptive use of AI-driven bots in influencing public opinion or commercial transactions. It represents an early example of state-level AI transparency and disclosure regulation.
The Center for AI Standards and Innovation (CAISI) at NIST is the U.S. government's primary body for AI safety standards and industry coordination. It develops voluntary guidelines, evaluates AI systems for national security risks (cybersecurity, biosecurity), and represents U.S. interests in international AI standards efforts.
The Partnership for Global Inclusivity on AI (PGIAI) is a U.S. State Department initiative aimed at ensuring that developing nations and underrepresented regions have meaningful access to and voice in the global AI governance ecosystem. It focuses on bridging the AI divide by mobilizing resources, building capacity, and fostering international cooperation so that the benefits and governance of AI are not concentrated solely among wealthy nations.
Anthropic's Responsible Scaling Policy (RSP) establishes a framework of 'AI Safety Levels' (ASLs) that tie capability thresholds to required safety and security measures before further scaling or deployment. It commits Anthropic to pausing development if safety measures cannot keep pace with capability advances, representing one of the first formal industry commitments to conditional scaling.
The UK AI Council was an independent government advisory body that provided strategic guidance on AI policy, public understanding, skills diversity, and ethical data-sharing frameworks until its dissolution in June 2023. Comprising industry, academia, and public sector representatives, it served as a bridge between the AI ecosystem and UK government. Its closure marked a shift toward individual expert advisory roles within the Department for Science, Innovation and Technology.
H.R.6573 prohibits businesses from selling personal data of U.S. military personnel to four adversarial nations: North Korea, China, Russia, and Iran. The bill addresses national security risks from foreign adversaries acquiring sensitive information about Armed Forces members, with enforcement authority granted to the FTC and state attorneys general.
TechCongress is a fellowship program that places technology and science professionals as advisors in the U.S. Congress to help legislators better understand and govern emerging technologies. The program aims to bridge the expertise gap between tech industry knowledge and congressional policymaking, including on issues like AI regulation and compute governance.
DeepMind's Frontier Safety Framework (FSF) establishes a structured approach to identifying and mitigating catastrophic risks from highly capable AI models before and during deployment. It introduces 'Critical Capability Levels' (CCLs) as thresholds that trigger enhanced safety evaluations, and outlines mitigation measures to prevent severe harms such as bioweapons development or AI autonomously undermining human oversight. The framework represents a concrete institutional commitment to capability-gated safety protocols.
Research by Lennart Heim on compute governance, summarizing findings and policy recommendations for governing AI through controls on computing hardware. The original page now returns a 404 error.
Anthropic provides a compliance analysis and policy response to the EU AI Act, examining how the regulation's requirements apply to frontier AI systems and offering the company's perspective on key provisions. The document reflects Anthropic's engagement with international AI governance frameworks and its approach to regulatory compliance for advanced AI models.
OpenAI's system card for GPT-4 documents safety evaluations, risk assessments, and mitigation measures conducted prior to deployment. It covers dangerous capability evaluations, red-teaming findings, and the RLHF-based safety interventions applied to reduce harmful outputs. The document represents OpenAI's public accountability framework for responsible deployment of a frontier AI model.
Singapore's PDPC presents a balanced AI governance framework that promotes innovation while protecting consumer interests, operationalized through AI Verify—a testing toolkit validating AI systems against 11 governance principles including transparency, fairness, safety, and accountability. The framework provides organizations with standardized methods for testing supervised-learning models and generating transparency reports for stakeholders. The associated AI Verify Foundation, backed by Google, IBM, and Microsoft, drives open-source AI testing capabilities as a global reference for responsible AI development.
CSET (Center for Security and Emerging Technology) at Georgetown University is a policy research organization focused on the security implications of emerging technologies, particularly AI. It produces research on AI policy, workforce, geopolitics, and governance.
Import AI is a weekly newsletter by Jack Clark (co-founder of Anthropic and former OpenAI policy director) covering the latest developments in artificial intelligence research, policy, and safety. It curates and analyzes significant AI papers, industry trends, and governance developments, offering expert commentary on their implications. The newsletter is widely read in the AI research and policy community.
The Centre for the Governance of AI (GovAI) is a leading research organization dedicated to helping decision-makers navigate the transition to a world with advanced AI. It produces rigorous research on AI governance, policy, and societal impacts, while fostering a global talent pipeline for responsible AI oversight. GovAI bridges technical AI safety concerns with practical policy recommendations.
The Belfer Center at Harvard Kennedy School is a leading policy research institution focused on international security, technology governance, and global affairs. It produces research and policy recommendations on emerging technology risks including AI, cybersecurity, and nuclear security. The center bridges academic research and policy practice through fellowships, publications, and engagement with government and industry.
The EU AI Office is the European Commission's central body responsible for overseeing and implementing the EU AI Act, particularly for general-purpose AI models. It coordinates AI governance across member states, enforces compliance with AI safety requirements, and supports the development of AI standards and testing methodologies.
Announcement of the EU Artificial Intelligence Act entering into force in August 2024, the world's first comprehensive horizontal AI regulation. It sets binding requirements for high-risk AI systems and bans certain AI applications across EU member states. The original announcement page now returns a 404 error.
The UK AI Safety Institute (AISI) is the UK government's dedicated body for evaluating and mitigating risks from advanced AI systems. It conducts technical safety research, develops evaluation frameworks for frontier AI models, and works with international partners to inform global AI governance and policy.