Consensus document from the 2025 Singapore Conference on AI (SCAI), authored by 88 researchers from 11 countries, organizing AI safety research into a defence-in-depth framework across three areas: Assessment, Development, and Control.
Frontier Model ForumOrganizationFrontier Model ForumThe Frontier Model Forum represents the AI industry's primary self-governance initiative for frontier AI safety, establishing frameworks and funding research, but faces fundamental criticisms about...Quality: 58/100
Analysis
AI Safety Research Allocation ModelAnalysisAI Safety Research Allocation ModelAnalysis finds AI safety research suffers 30-50% efficiency losses from industry dominance (60-70% of ~$700M annually), with critical areas like multi-agent dynamics and corrigibility receiving 3-5...Quality: 65/100AI Safety Research Value ModelAnalysisAI Safety Research Value ModelEconomic model analyzing AI safety research returns, recommending 3-10x funding increases from current ~$500M/year to $2-5B, with highest marginal returns (5-10x) in alignment theory and governance...Quality: 60/100AI Safety Intervention Effectiveness MatrixAnalysisAI Safety Intervention Effectiveness MatrixQuantitative analysis mapping 15+ AI safety interventions to specific risks reveals critical misallocation: 40% of 2024 funding ($400M+) flows to RLHF methods showing only 10-20% effectiveness agai...Quality: 73/100Power-Seeking Emergence Conditions ModelAnalysisPower-Seeking Emergence Conditions ModelFormal decomposition of power-seeking emergence into six quantified conditions, estimating current systems at 6.4% probability rising to 22% (2-4 years) and 36.5% (5-10 years). Provides concrete mi...Quality: 63/100
Other
InterpretabilityResearch AreaInterpretabilityMechanistic interpretability has extracted 34M+ interpretable features from Claude 3 Sonnet with 90% automated labeling accuracy and demonstrated 75-85% success in causal validation, though less th...Quality: 66/100Yoshua BengioPersonYoshua BengioComprehensive biographical overview of Yoshua Bengio's transition from deep learning pioneer (Turing Award 2018) to AI safety advocate, documenting his 2020 pivot at Mila toward safety research, co...Quality: 39/100Stuart RussellPersonStuart RussellStuart Russell (born 1962) is a British computer scientist and UC Berkeley professor who co-authored the dominant AI textbook 'Artificial Intelligence: A Modern Approach' (used in over 1,500 univer...Quality: 30/100AI ControlResearch AreaAI ControlAI Control is a defensive safety approach that maintains control over potentially misaligned AI through monitoring, containment, and redundancy, offering 40-60% catastrophic risk reduction if align...Quality: 75/100
Concepts
Self-Improvement and Recursive EnhancementCapabilitySelf-Improvement and Recursive EnhancementComprehensive analysis of AI self-improvement from current AutoML systems (23% training speedups via AlphaEvolve) to theoretical intelligence explosion scenarios, with expert consensus at ~50% prob...Quality: 69/100