AI Safety Intervention Portfolio
Provides a strategic framework for AI safety resource allocation by mapping 13+ interventions against 4 risk categories, evaluating each on ITN dimensions, and identifying portfolio gaps (epistemic resilience severely neglected, technical work over-concentrated in frontier labs). Total field investment is ≈$650M annually with about 1,100 FTEs (21% annual growth), but 85% of external funding comes from 5 sources and the safety-to-capabilities spending ratio is only 0.5-1.3%. Recommends rebalancing from very high RLHF investment toward evaluations (very high priority), AI control and compute governance (both high priority), with epistemic resilience increasing from very low to medium allocation.
Quick Assessment
| Dimension | Assessment | Evidence |
|---|---|---|
| Tractability | Medium-High | Varies widely: evaluations (high), compute governance (high), international coordination (low). Coefficient Giving's 2025 RFP allocated $40M for technical safety research. |
| Scalability | High | Portfolio approach scales across 4 risk categories and multiple timelines. AI Safety Field Growth Analysis shows 21% annual FTE growth rate. |
| Current Maturity | Medium | Core interventions established; significant gaps in epistemic resilience (less than 5% of portfolio) and post-incident recovery (under 1%). |
| Research Workforce | ≈1,100 FTEs | 600 technical + 500 non-technical AI safety FTEs in 2025, up from 400 total in 2022 (AI Safety Field Growth Analysis). |
| Time Horizon | Near-Long | Near-term (evaluations, control) complement long-term work (interpretability, governance). International AI Safety Report 2025 emphasizes urgency. |
| Funding Level | $110-130M/year external | 2024 external funding. Early 2025 shows 40-50% acceleration with $67M committed through July. Internal lab spending adds $500-550M for ≈$650M total (Coefficient Giving analysis). |
| Funding Concentration | 85% from 5 sources | Coefficient Giving: $63.6M (60%); Jaan Tallinn: $20M; Eric Schmidt: $10M; AI Safety Fund: $10M; FLI: $5M |
| Safety/Capabilities Ratio | ≈0.5-1.3% | $600-650M safety vs $50B+ capabilities spending. FAS recommends 30% of compute for safety research. |
Overview
This page provides a strategic view of the AI safety intervention landscape, analyzing how different interventions address different risk categories. Rather than examining interventions individually, this portfolio view helps identify coverage gaps, complementarities, and allocation priorities.
The intervention landscape can be divided into several categories: technical approaches (alignment, interpretability, control), governance mechanisms (legislation, compute governance, international coordination), field building (talent, funding, community), and resilience measures (epistemic security, economic adaptation). Each category has different tractability profiles, timelines, and risk coverage—understanding these tradeoffs is essential for strategic resource allocation.
An effective safety portfolio requires both breadth (covering diverse failure modes) and depth (sufficient investment in each area to achieve impact). The current portfolio shows significant concentration in certain areas (RLHF, capability evaluations) while other areas remain relatively neglected (epistemic resilience, international coordination).
Field Growth Trajectory
| Metric | 2022 | 2025 | Growth Rate | Notes |
|---|---|---|---|---|
| Technical AI Safety FTEs | 300 | 600 | 21%/year | AI Safety Field Growth Analysis 2025 |
| Non-Technical AI Safety FTEs | 100 | 500 | 71%/year | Governance, policy, operations |
| Total AI Safety FTEs | 400 | 1,100 | 40%/year | Field-wide compound growth |
| AI Safety Organizations | ≈50 | ≈120 | 24%/year | Exponential growth since 2020 |
| Capabilities FTEs (comparison) | ≈3,000 | ≈15,000 | 30-40%/year | OpenAI alone: 300 → 3,000 |
Critical Comparison: While the AI safety workforce has grown substantially, capabilities research is growing 30-40% per year. The ratio of capabilities to safety researchers has remained roughly constant at 10-15:1, meaning the absolute gap continues to widen.
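The field-wide growth rate and the capabilities-to-safety ratio above can be checked directly from the table's endpoint headcounts. A minimal sketch in Python (the per-category growth figures come from the underlying analysis and are not recomputed here):

```python
def cagr(start: float, end: float, years: int) -> float:
    """Compound annual growth rate implied by two endpoint values."""
    return (end / start) ** (1 / years) - 1

# Endpoints from the Field Growth Trajectory table (2022 -> 2025, 3 years)
total_safety_2022, total_safety_2025 = 400, 1_100
capabilities_2025 = 15_000

print(f"Field-wide safety FTE growth: {cagr(total_safety_2022, total_safety_2025, 3):.0%}/year")  # ~40%
print(f"Capabilities-to-safety FTE ratio (2025): {capabilities_2025 / total_safety_2025:.0f}:1")   # ~14:1
```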
Top Research Categories (by FTEs):
- Miscellaneous technical AI safety research
- LLM safety
- Interpretability
Intervention Categories and Risk Coverage
```mermaid
flowchart TD
subgraph Technical["Technical Approaches"]
INT[Interpretability]
CTRL[AI Control]
ALIGN[Alignment Research]
EVAL[Evaluations]
end
subgraph Governance["Governance"]
COMP[Compute Governance]
LEG[Legislation]
INTL[International Coordination]
RSP[Responsible Scaling]
end
subgraph Meta["Field Building & Resilience"]
FIELD[Field Building]
EPIST[Epistemic Resilience]
ECON[Economic Resilience]
end
subgraph Risks["Risk Categories"]
ACC[Accident Risks]
MIS[Misuse Risks]
STR[Structural Risks]
EPI[Epistemic Risks]
end
INT --> ACC
CTRL --> ACC
ALIGN --> ACC
EVAL --> ACC
EVAL --> MIS
COMP --> MIS
COMP --> STR
LEG --> MIS
LEG --> STR
INTL --> STR
RSP --> ACC
RSP --> MIS
FIELD --> ACC
FIELD --> STR
EPIST --> EPI
ECON --> STR
style ACC fill:#ffcccc
style MIS fill:#ffe6cc
style STR fill:#fff3cc
style EPI fill:#e6ccff
style Technical fill:#cce6ff
style Governance fill:#ccffcc
style Meta fill:#ffccff
```
Intervention by Risk Matrix
This matrix shows how strongly each major intervention addresses each risk category. Ratings are based on current evidence and expert assessments.
| Intervention | Accident Risks | Misuse Risks | Structural Risks | Epistemic Risks | Primary Mechanism |
|---|---|---|---|---|---|
| Interpretability | High | Low | Low | -- | Detect deception and misalignment in model internals |
| AI Control | High | Medium | -- | -- | External constraints regardless of AI intentions |
| Evaluations | High | Medium | Low | -- | Pre-deployment testing for dangerous capabilities |
| RLHF/Constitutional AI | Medium | Medium | -- | -- | Train models to follow human preferences |
| Scalable Oversight | Medium | Low | -- | -- | Human supervision of superhuman systems |
| Compute Governance | Low | High | Medium | -- | Hardware chokepoints limit access |
| Export Controls | Low | High | Medium | -- | Restrict adversary access to training compute |
| Responsible Scaling | Medium | Medium | Low | -- | Capability thresholds trigger safety requirements |
| International Coordination | Low | Medium | High | -- | Reduce racing dynamics through agreements |
| AI Safety Institutes | Medium | Medium | Medium | -- | Government capacity for evaluation and oversight |
| Field Building | Medium | Low | Medium | Low | Grow talent pipeline and research capacity |
| Epistemic Security | -- | Low | Low | High | Protect collective truth-finding capacity |
| Content Authentication | -- | Medium | -- | High | Verify authentic content in synthetic era |
Legend: High = primary focus, addresses directly; Medium = secondary impact; Low = indirect or limited; -- = minimal relevance
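For readers who want to work with the matrix programmatically, a minimal sketch encoding the ratings above as a data structure and aggregating coverage per risk category; the numeric weights assigned to High/Medium/Low are illustrative assumptions rather than part of the source assessment:

```python
# Illustrative weights for the qualitative ratings (assumption, not from the source)
WEIGHT = {"High": 3, "Medium": 2, "Low": 1, None: 0}

# (accident, misuse, structural, epistemic) ratings per intervention, copied from the matrix
MATRIX = {
    "Interpretability":           ("High", "Low", "Low", None),
    "AI Control":                 ("High", "Medium", None, None),
    "Evaluations":                ("High", "Medium", "Low", None),
    "RLHF/Constitutional AI":     ("Medium", "Medium", None, None),
    "Scalable Oversight":         ("Medium", "Low", None, None),
    "Compute Governance":         ("Low", "High", "Medium", None),
    "Export Controls":            ("Low", "High", "Medium", None),
    "Responsible Scaling":        ("Medium", "Medium", "Low", None),
    "International Coordination": ("Low", "Medium", "High", None),
    "AI Safety Institutes":       ("Medium", "Medium", "Medium", None),
    "Field Building":             ("Medium", "Low", "Medium", "Low"),
    "Epistemic Security":         (None, "Low", "Low", "High"),
    "Content Authentication":     (None, "Medium", None, "High"),
}

RISKS = ("Accident", "Misuse", "Structural", "Epistemic")

for i, risk in enumerate(RISKS):
    total = sum(WEIGHT[ratings[i]] for ratings in MATRIX.values())
    print(f"{risk} risks: aggregate weighted coverage {total}")
```

On these illustrative weights, epistemic risks receive by far the least aggregate coverage, consistent with the gaps discussed below.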
Prioritization Framework
This framework evaluates interventions across the standard Importance-Tractability-Neglectedness (ITN) dimensions, with additional consideration for timeline fit and portfolio complementarity.
| Intervention | Tractability | Impact Potential | Neglectedness | Timeline Fit | Overall Priority |
|---|---|---|---|---|---|
| Interpretability | Medium | High | Low | Long | High |
| AI Control | High | Medium-High | Medium | Near | Very High |
| Evaluations | High | Medium | Low | Near | High |
| Compute Governance | High | High | Low | Near | Very High |
| International Coordination | Low | Very High | High | Long | High |
| Field Building | High | Medium | Medium | Ongoing | Medium-High |
| Epistemic Resilience | Medium | Medium | High | Near-Long | Medium-High |
| Scalable Oversight | Medium-Low | High | Medium | Long | Medium |
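The Overall Priority column reflects qualitative judgment rather than a formula, but the ITN aggregation can be sketched as a weighted score. The numeric scale and equal weights below are illustrative assumptions:

```python
# Illustrative numeric mapping for the qualitative ratings (an assumption, not from the source)
SCALE = {"Low": 1.0, "Medium-Low": 1.5, "Medium": 2.0, "Medium-High": 2.5,
         "High": 3.0, "Very High": 4.0}

def itn_score(tractability: str, impact: str, neglectedness: str,
              weights: tuple = (1.0, 1.0, 1.0)) -> float:
    """Weighted sum across the ITN dimensions (equal weights by default)."""
    w_t, w_i, w_n = weights
    return w_t * SCALE[tractability] + w_i * SCALE[impact] + w_n * SCALE[neglectedness]

# Ratings copied from the table. Note that the Overall Priority column also folds in
# timeline fit and portfolio complementarity, so a raw ITN score will not reproduce it exactly.
print(itn_score("High", "Medium-High", "Medium"))  # AI Control
print(itn_score("High", "High", "Low"))            # Compute Governance
print(itn_score("Low", "Very High", "High"))       # International Coordination
```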
Prioritization Rationale
Very High Priority:
- AI Control scores highly because it provides near-term safety benefits (70-85% tractability for human-level systems) regardless of whether alignment succeeds. It represents a practical bridge during the transition period. Redwood Research received $1.2M for control research in 2024.
- Compute Governance is one of few levers creating physical constraints on AI development. Hardware chokepoints exist, some measures are already implemented (EU AI Act compute thresholds, US export controls), and impact potential is substantial. GovAI produces leading research on compute governance mechanisms.
High Priority:
- Interpretability is potentially essential if alignment proves difficult (only reliable way to detect sophisticated deception). MIT Technology Review named mechanistic interpretability a 2026 Breakthrough Technology. Anthropic's attribution graphs revealed hidden reasoning in Claude 3.5 Haiku. FAS recommends federal R&D funding through DARPA and NSF.
- Evaluations provide measurable near-term impact and are already standard practice at major labs. Coefficient Giving launched an RFP for capability evaluations ($200K-$5M grants). METR partners with Anthropic and OpenAI on frontier model evaluations. NIST invested $20M in AI Economic Security Centers.
- International Coordination has very high impact potential for addressing structural risks like racing dynamics, but low tractability given current geopolitical tensions. The International AI Safety Report 2025, led by Yoshua Bengio with 100+ authors from 30 countries, represents the largest global collaboration to date.
Medium-High Priority:
- Field Building and Epistemic Resilience are relatively neglected meta-level interventions that multiply the effectiveness of direct technical and governance work. 80,000 Hours notes that good funding opportunities exist in AI safety for qualified researchers.
Portfolio Gaps and Complementarities
Coverage Gaps
Analysis of the current intervention portfolio reveals several areas where coverage is thin:
| Gap Area | Current Investment | Risk Exposure | Recommended Action |
|---|---|---|---|
| Epistemic Risks | Under 5% of portfolio ($3-5M/year) | Epistemic collapse, reality fragmentation | Increase to 8-10% of portfolio; invest in content authentication and epistemic infrastructure |
| Long-term Structural Risks | 4-6% of portfolio; international coordination has low tractability | Lock-in, concentration of power | Develop alternative coordination mechanisms; invest in governance research |
| Post-Incident Recovery | Under 1% of portfolio | All risk categories | Develop recovery protocols and resilience measures; allocate 3-5% of portfolio |
| Misuse by State Actors | Export controls are primary lever; $5-10M in policy research | Authoritarian tools, surveillance | Research additional governance mechanisms; increase to $15-25M |
| Independent Evaluation Capacity | 70%+ of evals done by labs themselves | Conflict of interest, verification gaps | Coefficient Giving's eval RFP addresses this with $200K-$5M grants |
Key Complementarities
Certain interventions work better together than in isolation:
Technical + Governance:
- AI Evaluations inform Responsible Scaling Policies thresholds
- Interpretability enables verification for governance commitments and audits
- AI Control provides safety margin while governance matures
Near-term + Long-term:
- Compute Governance buys time for Interpretability research
- AI Evaluations identify near-term risks while Scalable Oversight develops
- AI Safety Field Building and Community ensures capacity for future technical work
Prevention + Resilience:
- Technical safety research aims to prevent failures
- AI-Era Epistemic Security and economic resilience limit damage if prevention fails
- Both are needed for robust defense-in-depth
Portfolio Funding Allocation
The following table estimates 2024 funding levels by intervention area and compares them to recommended allocations based on neglectedness and impact potential. Total external AI safety funding was approximately $110-130 million in 2024, with Coefficient Giving providing ~60% of this amount.
| Intervention Area | Est. 2024 Funding | % of Total | Recommended Shift | Key Funders |
|---|---|---|---|---|
| RLHF/Training Methods | $15-35M | ≈25% | Decrease to 20% | Frontier labs (internal), academic grants |
| Interpretability | $15-25M | ≈18% | Maintain | Coefficient Giving, Superalignment Fast Grants ($10M) |
| Evaluations & Evals Infrastructure | $12-18M | ≈13% | Increase to 20% | CAIS ($1.5M), UK AISI, labs |
| AI Control Research | $1-12M | ≈9% | Increase to 15% | Redwood Research ($1.2M), Anthropic |
| Compute Governance | $1-10M | ≈7% | Increase to 12% | Government programs, policy organizations |
| Field Building & Talent | $10-15M | ≈11% | Maintain | 80,000 Hours, MATS, various fellowships |
| Governance & Policy | $1-12M | ≈9% | Increase to 12% | Coefficient Giving policy grants, government initiatives |
| International Coordination | $1-5M | ≈4% | Increase to 8% | UK/EU government initiatives (≈$14M total) |
| Epistemic Resilience | $1-4M | ≈3% | Increase to 8% | Very few dedicated funders |
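One way to read the Recommended Shift column is as target shares of the external budget. A minimal sketch translating current and recommended shares into dollar changes, using the ~$120M midpoint of the 2024 external estimate (shares are the table's approximate figures):

```python
EXTERNAL_TOTAL_M = 120  # midpoint of the $110-130M external estimate for 2024

# (current share, recommended share) as fractions, from the allocation table
# (areas marked "Maintain" are omitted)
shifts = {
    "RLHF/Training Methods": (0.25, 0.20),
    "Evaluations & Evals Infrastructure": (0.13, 0.20),
    "AI Control Research": (0.09, 0.15),
    "Compute Governance": (0.07, 0.12),
    "Governance & Policy": (0.09, 0.12),
    "International Coordination": (0.04, 0.08),
    "Epistemic Resilience": (0.03, 0.08),
}

for area, (current, target) in shifts.items():
    print(f"{area}: {(target - current) * EXTERNAL_TOTAL_M:+.1f}M USD/year at a constant total")

# The recommended shares sum to more than the current ones, so the shifts implicitly
# assume overall funding growth (consistent with the 2025 trajectory noted below).
net = sum((target - current) * EXTERNAL_TOTAL_M for current, target in shifts.values())
print(f"Net change across shifted areas: {net:+.0f}M USD/year")
```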
2025 Funding Landscape Update
| Funder | 2024 Allocation | Focus Areas | Notes |
|---|---|---|---|
| Coefficient Giving | $63.6M | Technical safety, governance, field building | 60% of external funding |
| Jaan Tallinn | $20M | Long-term alignment research | Personal foundation |
| Eric Schmidt (Schmidt Sciences) | $10M | Safety benchmarking, adversarial evaluation | Quick Market Pitch |
| AI Safety Fund | $10M | Collaborative research (Anthropic, Google, Microsoft, OpenAI) | Frontier Model Forum |
| Future of Life Institute | $5M | Smaller grants, fellowships | Diverse portfolio |
| Steven Schuurman Foundation | €5M/year | Various AI safety initiatives | Elastic co-founder |
| Total External | $110-130M | — | 2024 estimate |
2025 Trajectory: Early data (through July 2025) shows $67M already committed, putting the year on track to exceed 2024 totals by 40-50%.
Funding Gap Analysis
The funding landscape reveals several structural imbalances:
| Gap Type | Current State | Impact | Recommended Action |
|---|---|---|---|
| Climate vs AI safety | Climate philanthropy: ≈$1-15B; AI safety: ≈$130M | ≈100x disparity despite comparable catastrophic potential | Increase AI safety funding to at least $100M-1B annually |
| Capabilities vs safety | ≈$100B in AI data center capex (2024) vs ≈$130M safety | ≈1500:1 ratio | Redirect 0.5-1% of capabilities spending to safety |
| Funder concentration | Coefficient Giving: 60% of external funding | Single point of failure; limits diversity | Diversify funding sources; new initiatives like Humanity AI ($100M) |
| Talent pipeline | Over-optimized for researchers | Shortage in governance, operations, advocacy | Expand non-research talent programs |
Resource Allocation Assessment
Current vs. Recommended Allocation
| Area | Current Allocation | Recommended | Rationale |
|---|---|---|---|
| RLHF/Training | Very High | High | Deployed at scale but limited effectiveness against deceptive alignment |
| Interpretability | High | High | Rapid progress; potential for fundamental breakthroughs |
| Evaluations | High | Very High | Critical for identifying dangerous capabilities pre-deployment |
| AI Control | Medium | High | Near-term tractable; provides safety regardless of alignment |
| Compute Governance | Medium | High | One of few physical levers; already showing policy impact |
| International Coordination | Low | Medium | Low tractability but very high stakes |
| Epistemic Resilience | Very Low | Medium | Highly neglected; addresses underserved risk category |
| Field Building | Medium | Medium | Maintain current investment; returns are well-established |
Investment Concentration Risks
The current portfolio shows several structural vulnerabilities:
| Concentration Type | Current State | Risk | Mitigation |
|---|---|---|---|
| Funder concentration | Coefficient Giving provides ≈60% of external funding | Strategy changes affect entire field | Cultivate diverse funding sources |
| Geographic concentration | US and UK receive majority of funding | Limited global coordination capacity | Support emerging hubs (Berlin, Canada, Australia) |
| Frontier lab dependence | Most technical safety at Anthropic, OpenAI, DeepMind | Conflicts of interest; limited independent verification | Increase funding to MIRI ($1.1M), Redwood, ARC |
| Research over operations | Pipeline over-optimized for researchers | Shortage of governance, advocacy, operations talent | Expand non-research career paths |
| Technical over governance | Technical ~60% vs governance ≈15% of funding | Governance may be more neglected and tractable | Rebalance toward policy research |
| Prevention over resilience | Minimal investment in post-incident recovery | No fallback if prevention fails | Develop recovery protocols |
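Funder concentration can be quantified with a standard measure such as the Herfindahl-Hirschman Index over funder shares. A minimal sketch; the split of the non-Coefficient remainder is an illustrative assumption, and the conclusion does not depend on it:

```python
def hhi(shares_percent):
    """Herfindahl-Hirschman Index: sum of squared percentage shares (0 to 10,000)."""
    return sum(s ** 2 for s in shares_percent)

# Coefficient Giving's ~60% share alone yields an index of 3,600, above the ~2,500
# level conventionally treated as "highly concentrated", regardless of how the
# remaining ~40% is split among smaller funders.
print(hhi([60]))
print(hhi([60, 17, 8, 8, 4, 3]))  # illustrative split of the remainder across other funders
```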
Strategic Considerations
Worldview Dependencies
Different beliefs about AI risk lead to different portfolio recommendations:
| Worldview | Prioritize | Deprioritize |
|---|---|---|
| Alignment is very hard | Interpretability, Control, International coordination | RLHF, Voluntary commitments |
| Misuse is the main risk | Compute governance, Content authentication, Legislation | Interpretability, Agent foundations |
| Short timelines | AI Control, Evaluations, Responsible scaling | Long-term governance research |
| Racing dynamics dominate | International coordination, Compute governance | Unilateral safety research |
| Epistemic collapse is likely | Epistemic security, Content authentication | Technical alignment |
Portfolio Robustness
A robust portfolio should satisfy the following criteria, which can help evaluate current gaps and guide future allocation:
| Robustness Criterion | Current Status | Gap Assessment | Target |
|---|---|---|---|
| Cover multiple failure modes | Accident risks: 60% coverage; Misuse: 50%; Structural: 30%; Epistemic: under 15% | Medium gap | 70%+ coverage across all categories |
| Prevention and resilience | ~95% prevention, ≈5% resilience | Large gap | 80% prevention, 20% resilience |
| Near-term and long-term balance | 55% near-term (evals, control), 45% long-term (interpretability, governance) | Small gap | Maintain current balance |
| Independent research capacity | Frontier labs: 70%+ of technical safety; Independents: under 30% | Medium gap | 50/50 split between labs and independents |
| Support multiple worldviews | Most interventions robust across scenarios | Small gap | Maintain |
| Geographic diversity | US/UK: 80%+ of funding; EU: 10%; ROW: under 10% | Medium gap | US/UK: 60%, EU: 20%, ROW: 20% |
| Funder diversity | 5 funders provide 85% of external funding; Coefficient Giving alone provides 60% | Large gap | No single funder greater than 25% |
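These criteria lend themselves to simple threshold checks, which makes gap tracking easy to automate. A minimal sketch; the current-state and target figures are taken from the table, and the pass/fail logic is an illustrative simplification:

```python
# (current value, target value, higher_is_better) per criterion, figures from the table
criteria = {
    "Epistemic risk coverage (%)":               (15, 70, True),
    "Resilience share of portfolio (%)":         (5, 20, True),
    "Independent share of technical safety (%)": (30, 50, True),
    "Largest single funder share (%)":           (60, 25, False),
    "US/UK share of funding (%)":                (80, 60, False),
}

for name, (current, target, higher_is_better) in criteria.items():
    met = current >= target if higher_is_better else current <= target
    print(f"{name}: current {current}, target {target} -> {'met' if met else 'gap'}")
```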
Key Sources
| Source | Type | Relevance |
|---|---|---|
| Coefficient Giving Progress 2024 | Funder Report | Primary data on AI safety funding levels and priorities |
| AI Safety Funding Situation Overview | Analysis | Comprehensive breakdown of funding sources and gaps |
| AI Safety Needs More Funders | Policy Brief | Comparison to other catastrophic risk funding |
| AI Safety Field Growth Analysis 2025 | Research | Field growth metrics, 1,100 FTEs, 21% annual growth |
| International AI Safety Report 2025 | Global Report | 100+ authors, 30 countries, Yoshua Bengio lead |
| Future of Life AI Safety Index 2025 | Industry Assessment | 33 indicators across 6 domains for 7 leading companies |
| Coefficient Giving Technical AI Safety RFP | Grant Program | $40M allocation for technical safety research |
| Coefficient Giving Capability Evaluations RFP | Grant Program | $200K-$5M grants for evaluation infrastructure |
| America's AI Action Plan (July 2025) | Policy | US government AI priorities including evaluations ecosystem |
| Accelerating AI Interpretability (FAS) | Policy Brief | Federal funding recommendations for interpretability |
| 80,000 Hours: AI Risk | Career Guidance | Intervention prioritization and neglectedness analysis |
| RLHF Limitations Paper | Research | Evidence on limitations of current alignment methods |
| Carnegie AI Safety as Global Public Good | Policy Analysis | International coordination challenges and research priorities |
| ITU Annual AI Governance Report 2025 | Global Report | AI governance landscape across nations |
References
Open Philanthropy issued a request for proposals seeking technical AI safety research projects, signaling funding priorities and research directions the organization considers most valuable. The RFP outlines areas of interest including interpretability, scalable oversight, and related alignment challenges, aiming to grow the field by supporting researchers and organizations working on these problems.
A landmark international scientific assessment co-authored by 96 experts from 30 countries, providing a comprehensive overview of general-purpose AI capabilities, risks, and risk management approaches. It aims to establish shared scientific understanding across nations as a foundation for global AI governance. The report covers topics including capability evaluation, misuse risks, systemic risks, and mitigation strategies.
This research piece from Coefficient Giving argues that AI safety and security research is significantly underfunded relative to the risks involved, and makes the case for philanthropists and funders to increase financial support for the field. It examines funding gaps, highlights promising organizations and research areas, and encourages diversification of the funder base beyond a few major donors.
Open Philanthropy is a major philanthropic organization that funds work across global health, AI safety, biosecurity, and other cause areas. Their grants database provides transparency into which organizations and research directions receive funding. They are one of the largest funders of AI safety and existential risk research.
The AI Safety Fund (AISF) is a $10 million+ collaborative initiative launched in October 2023 by Anthropic, Google, Microsoft, and OpenAI (via the Frontier Model Forum) along with philanthropic partners to fund independent AI safety and security research. It has distributed two rounds of grants focused on responsible frontier AI development, public safety risk reduction, and standardized third-party capability evaluations. The fund is now directly managed by the Frontier Model Forum following the closure of its original administrator, the Meridian Institute.
Open Philanthropy reviews its 2024 philanthropic activities and outlines priorities for 2025, with emphasis on AI safety research funding, strategic partnerships, and grants spanning global health and catastrophic risk reduction. The report provides transparency into one of the field's largest funders and signals where major resources will flow in the AI safety ecosystem.
The Centre for the Governance of AI (GovAI) research hub aggregates policy-relevant technical and governance research on frontier AI systems, covering topics from biosecurity and cybercrime to labor market impacts and AI auditing. It serves as a comprehensive repository of GovAI's publications spanning multiple years and research themes. The page indexes papers addressing near-term and long-term risks from advanced AI systems.
MIT Technology Review highlights mechanistic interpretability as one of its top breakthrough technologies of 2026, summarizing progress by Anthropic, OpenAI, and Google DeepMind in mapping LLM internal features and tracing model reasoning pathways. The piece covers both sparse autoencoder-based feature mapping and chain-of-thought monitoring as complementary tools for understanding model behavior. It notes ongoing debate about whether LLMs will ever be fully interpretable.
METR is an organization conducting research and evaluations to assess the capabilities and risks of frontier AI systems, focusing on autonomous task completion, AI self-improvement risks, and evaluation integrity. They have developed the 'Time Horizon' metric measuring how long AI agents can autonomously complete software tasks, showing exponential growth over recent years. They work with major AI labs including OpenAI, Anthropic, and Amazon to evaluate catastrophic risk potential.
NIST awarded $20 million to MITRE Corporation to establish two AI Economic Security Centers focused on advancing AI in U.S. manufacturing productivity and protecting critical infrastructure from cyberthreats. The initiative implements recommendations from the White House's America's AI Action Plan and represents a public-private partnership model for accelerating AI development and deployment in national priority areas.
OpenAI's Superalignment team announced a fast grants program to fund external researchers working on technical alignment and interpretability research, aiming to solve the problem of aligning superintelligent AI systems within four years. The program offers grants ranging from $100K to $2M to support academic labs, graduate students, and independent researchers. This reflects OpenAI's strategy of leveraging external talent to accelerate progress on their superalignment research agenda.
An investigative journalism piece examining the philanthropic landscape funding AI regulation and safety efforts, identifying key donors, foundations, and grant recipients shaping the AI governance space. The article maps financial flows from major funders to policy organizations, research groups, and advocacy efforts focused on AI oversight.
This paper provides a critical sociotechnical analysis of Reinforcement Learning from Human Feedback (RLHF) and AI Feedback (RLAIF) as alignment approaches for large language models. The authors argue that while RLHF aims to achieve honesty, harmlessness, and helpfulness, these methods have significant theoretical and practical limitations in capturing the complexity of human ethics and ensuring genuine AI safety. The paper identifies inherent tensions in alignment goals and highlights neglected ethical issues, ultimately calling for a more nuanced and reflective approach to RLxF implementation in AI development.
The Future of Life Institute's AI Safety Index Summer 2025 systematically evaluates leading AI companies on safety practices, finding widespread deficiencies across risk management, transparency, and existential safety planning. Anthropic receives the highest grade of C+, indicating that even the best-performing company falls significantly short of adequate safety standards. The report serves as a comparative benchmark for industry accountability.
This 80,000 Hours problem profile argues that AI systems pursuing goals misaligned with human values could seek to accumulate power and resources in ways that permanently undermine human control. It outlines why this risk is among the most pressing long-term problems and explains the mechanisms by which advanced AI could pose catastrophic or existential threats. The piece serves as an accessible entry point into the case for prioritizing AI safety work.
The ITU's 2025 AI Governance Report provides a comprehensive overview of global AI governance developments, frameworks, and policy trends from an international telecommunications and ICT standards perspective. It examines how nations and international bodies are approaching AI regulation, safety standards, and coordination challenges. The report serves as a reference document for policymakers and stakeholders navigating the evolving AI governance landscape.