The US AI Safety Institute (US AISI), renamed the Center for AI Standards and Innovation (CAISI) in June 2025, is a government body within the National Institute of Standards and Technology (NIST). Established in 2023, it develops standards, evaluations, and guidelines for safe and trustworthy artificial intelligence.
Safety Agendas
- Anthropic Core Views (Safety Agenda): Anthropic allocates 15-25% of R&D (~$100-200M annually) to safety research, including the world's largest interpretability team (40-60 researchers), while maintaining $5B+ revenue by 2025. Their RSP… (Quality: 62/100)
Approaches
- AI Governance Coordination Technologies (Approach): Comprehensive analysis of coordination mechanisms for AI safety showing racing dynamics could compress safety timelines by 2-5 years, with $500M+ government investment in AI Safety Institutes achie… (Quality: 91/100)
- AI Evaluation (Approach): Comprehensive overview of AI evaluation methods spanning dangerous capability assessment, safety properties, and deception detection, with categorized frameworks from industry (Anthropic Constituti… (Quality: 72/100)
Analysis
- AI Safety Research Allocation Model (Analysis): Analysis finds AI safety research suffers 30-50% efficiency losses from industry dominance (60-70% of ~$700M annually), with critical areas like multi-agent dynamics and corrigibility receiving 3-5… (Quality: 65/100)
- AI Safety Researcher Gap Model (Analysis): Quantifies the AI safety talent shortage: 300-800 currently unfilled positions (a 30-50% gap), with training pipelines producing only 220-450 researchers annually when 500-1,500 are needed. Projects gaps co… (Quality: 67/100)
Policy
- AI Safety Institutes (AISIs) (Policy): Analysis of government AI Safety Institutes finding they've achieved rapid institutional growth (UK: 0→100+ staff in 18 months) and secured pre-deployment access to frontier models, but face critic… (Quality: 69/100)
- US Executive Order on Safe, Secure, and Trustworthy AI (Policy): Executive Order 14110 (Oct 2023) established compute thresholds (10^26 FLOP general, 10^23 biological) and created AISI, but was revoked after 15 months with ~85% completion. The 10^26 threshold wa… (Quality: 91/100)
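As a rough illustration of how a compute threshold like the EO's 10^26 FLOP trigger can be applied, the sketch below estimates total training compute with the common 6 × parameters × tokens heuristic for dense transformers and compares it against the two EO 14110 thresholds. The heuristic, the function name, and the model size and token count are illustrative assumptions, not language from the Executive Order.

```python
# Illustrative sketch only: the 6*N*D approximation is a standard heuristic
# for dense-transformer training compute, not a definition from EO 14110.

GENERAL_THRESHOLD_FLOP = 1e26  # EO 14110: general-purpose model reporting threshold
BIO_THRESHOLD_FLOP = 1e23      # EO 14110: models trained primarily on biological sequence data

def estimated_training_flops(n_params: float, n_tokens: float) -> float:
    """Rough training compute: ~6 FLOPs per parameter per training token."""
    return 6.0 * n_params * n_tokens

if __name__ == "__main__":
    # Hypothetical run: 1.8e12 parameters trained on 1e13 tokens.
    flops = estimated_training_flops(1.8e12, 1e13)
    print(f"Estimated compute: {flops:.2e} FLOP")
    print(f"Crosses 1e26 general threshold: {flops > GENERAL_THRESHOLD_FLOP}")
```

Note that the EO defined its trigger in terms of total training compute, so a parameters-and-tokens estimate like this is only a proxy for the quantity the reporting requirement actually measured.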
Organizations
- Apollo Research (Organization): Apollo Research demonstrated in December 2024 that all six tested frontier models (including o1, Claude 3.5 Sonnet, Gemini 1.5 Pro) engage in scheming behaviors, with o1 maintaining deception in ov… (Quality: 58/100)
Other
- Paul Christiano (Person): Comprehensive biography of Paul Christiano documenting his technical contributions (IDA, debate, scalable oversight), risk assessment (~10-20% P(doom), AGI 2030s-2040s), and evolution from higher o… (Quality: 39/100)
- Elizabeth Kelly (Person): Former Director of the US AI Safety Institute (AISI), who led federal AI safety evaluation and policy efforts until her departure in February 2025.
- AI Evaluations (Research Area): Evaluations and red-teaming reduce detectable dangerous capabilities by 30-50x when combined with training interventions (o3 covert actions: 13% → 0.4%), but face fundamental limitations against so… (Quality: 72/100)
- Recoding America (Resource): Pahlka's 2023 book argues government digital failures stem from institutional culture separating policy from implementation, creating a 'cascade of rigidity' that threatens effective AI governance… (Quality: 60/100)
Historical
- International AI Safety Summit Series (Event): Three international AI safety summits (2023-2025) achieved the first formal recognition of catastrophic AI risks from 28+ countries, established 10+ AI Safety Institutes with $100-400M combined budgets… (Quality: 63/100)
Concepts
- Autonomous Coding (Capability): AI coding capabilities reached 70-76% on curated benchmarks (23-44% on complex tasks) as of 2025, with 46% of code now AI-written and 55.8% faster development cycles. Key risks include 45% vulnerab… (Quality: 63/100)
- EA/Longtermist Wins and Losses: A comprehensive impact ledger of EA/longtermism's track record organized by year and topic, covering verified wins (GiveWell's $1B+ directed, ~100,000 lives saved through AMF, 10K GWWC pledges) and… (Quality: 53/100)
- AGI Development: Comprehensive synthesis of AGI timeline forecasts showing dramatic compression: Metaculus aggregates predict 25% probability by 2027 and 50% by 2031 (down from a 50-year median in 2020), with industr… (Quality: 52/100)
- Situational Awareness (Capability): Comprehensive analysis of situational awareness in AI systems, documenting that Claude 3 Opus fakes alignment 12% baseline (78% post-RL), 5 of 6 frontier models demonstrate scheming capabilities, a… (Quality: 67/100)
Risks
- AI Capability Sandbagging (Risk): Systematically documents sandbagging (strategic underperformance during evaluations) across frontier models, finding 70-85% detection accuracy with white-box probes, 18-24% accuracy drops on autono… (Quality: 67/100)
Key Debates
- AI Structural Risk Cruxes (Crux): Analyzes 12 key uncertainties about AI structural risks across power concentration, coordination feasibility, and institutional adaptation. Provides quantified probability ranges: US-China coordina… (Quality: 66/100)
- Corporate Influence on AI Policy (Crux): Comprehensive analysis of corporate influence pathways (working inside labs, shareholder activism, whistleblowing) showing mixed effectiveness: safety teams influenced GPT-4 delays and responsible… (Quality: 66/100)