Summary
An interactive sortable table comparing 42 AI safety approaches on dimensions including safety uplift, capability uplift, net world safety, scalability to superintelligence, and differential progress. Includes grouped and unified views.
Safety Approaches Table
Category | Current research investment | Safety vs capability progress ratio | Recommended funding change | How much does this reduce catastrophic risk? | Does it make AI more capable? | Is the world safer with this? | Does it work as AI gets smarter? | Does it work against deceptive AI? | Works for superintelligent AI? | Current adoption level | |
|---|---|---|---|---|---|---|---|---|---|---|---|
Training & Alignment | $1B+/yr | CAPABILITY-DOMINANT | REDUCE | LOW-MEDIUM | DOMINANT | UNCLEAR | BREAKS | NONE | NO | UNIVERSAL | |
Training & Alignment | $50-200M/yr | CAPABILITY-LEANING | MAINTAIN | MEDIUM | SIGNIFICANT | UNCLEAR | PARTIAL | WEAK | UNLIKELY | WIDESPREAD | |
Training & Alignment | $5-20M/yr | SAFETY-LEANING | INCREASE | UNKNOWN | SOME | UNCLEAR | MAYBE | PARTIAL | MAYBE | EXPERIMENTAL | |
Training & Alignment | $100-500M/yr | BALANCED | MAINTAIN | MEDIUM | SIGNIFICANT | HELPFUL | PARTIAL | PARTIAL | UNLIKELY | WIDESPREAD | |
Training & Alignment | $10-50M/yr | SAFETY-LEANING | INCREASE | UNKNOWN | SOME | UNCLEAR | UNKNOWN | UNKNOWN | MAYBE | EXPERIMENTAL | |
Training & Alignment | $500M+/yr | CAPABILITY-DOMINANT | REDUCE | LOW | SIGNIFICANT | UNCLEAR | PARTIAL | NONE | NO | UNIVERSAL | |
Training & Alignment | $1-5M/yr | SAFETY-DOMINANT | INCREASE | MEDIUM | NEUTRAL | HELPFUL | UNKNOWN | PARTIAL | MAYBE | NONE | |
Training & Alignment | $10-30M/yr | SAFETY-LEANING | INCREASE | MEDIUM | SOME | HELPFUL | PARTIAL | WEAK | UNLIKELY | WIDESPREAD | |
Training & Alignment | $50-150M/yr | BALANCED | MAINTAIN | LOW-MEDIUM | SOME | HELPFUL | PARTIAL | NONE | NO | UNIVERSAL | |
Training & Alignment | $5-20M/yr | SAFETY-LEANING | INCREASE | MEDIUM | SOME | HELPFUL | UNKNOWN | PARTIAL | MAYBE | EXPERIMENTAL | |
Interpretability | $50-150M/yr | SAFETY-DOMINANT | PRIORITIZE | LOW (now) / HIGH (potential) | NEUTRAL | HELPFUL | UNKNOWN | STRONG (if works) | MAYBE | EXPERIMENTAL | |
Interpretability | $10-30M/yr | SAFETY-DOMINANT | INCREASE | LOW (now) | NEUTRAL | HELPFUL | PARTIAL | PARTIAL | UNKNOWN | EXPERIMENTAL | |
Interpretability | $5-20M/yr | SAFETY-LEANING | INCREASE | MEDIUM | SOME | HELPFUL | PARTIAL | PARTIAL | UNKNOWN | EXPERIMENTAL | |
Interpretability | $5-10M/yr | SAFETY-DOMINANT | MAINTAIN | LOW | NEUTRAL | HELPFUL | YES | PARTIAL | MAYBE | WIDESPREAD | |
Evaluation | $20-50M/yr | SAFETY-DOMINANT | INCREASE | MEDIUM | NEUTRAL | HELPFUL | PARTIAL | WEAK | UNLIKELY | WIDESPREAD | |
Evaluation | $50-200M/yr | BALANCED | MAINTAIN | LOW-MEDIUM | NEUTRAL | HELPFUL | PARTIAL | NONE | NO | UNIVERSAL | |
Evaluation | $10-30M/yr | SAFETY-DOMINANT | PRIORITIZE | MEDIUM | NEUTRAL | HELPFUL | UNKNOWN | WEAK | UNLIKELY | SOME | |
Evaluation | $10-30M/yr | SAFETY-DOMINANT | INCREASE | LOW-MEDIUM | NEUTRAL | HELPFUL | PARTIAL | WEAK | UNLIKELY | SOME | |
Evaluation | $5-15M/yr | SAFETY-DOMINANT | PRIORITIZE | MEDIUM-HIGH | TAX | HELPFUL | PARTIAL | PARTIAL | UNLIKELY | EXPERIMENTAL | |
Evaluation | $10-30M/yr | SAFETY-LEANING | INCREASE | MEDIUM | SOME | HELPFUL | PARTIAL | WEAK | NO | SOME | |
Evaluation | $5-15M/yr | SAFETY-DOMINANT | PRIORITIZE | HIGH (if works) | NEUTRAL | HELPFUL | UNKNOWN | UNKNOWN | UNKNOWN | EXPERIMENTAL | |
Architectural | $50-200M/yr | BALANCED | MAINTAIN | LOW | TAX | NEUTRAL | BREAKS | NONE | NO | UNIVERSAL | |
Architectural | (included in RLHF) | BALANCED | MAINTAIN | LOW-MEDIUM | TAX | NEUTRAL | BREAKS | NONE | NO | UNIVERSAL | |
Architectural | $20-50M/yr | SAFETY-LEANING | INCREASE | MEDIUM | TAX | HELPFUL | PARTIAL | PARTIAL | UNLIKELY | SOME | |
Architectural | $10-30M/yr | SAFETY-DOMINANT | INCREASE | MEDIUM | TAX | HELPFUL | PARTIAL | PARTIAL | PARTIAL | WIDESPREAD | |
Architectural | $10-30M/yr | SAFETY-DOMINANT | INCREASE | MEDIUM | NEUTRAL | HELPFUL | PARTIAL | WEAK | NO | SOME | |
Architectural | $10-30M/yr | SAFETY-LEANING | INCREASE | MEDIUM | TAX | HELPFUL | PARTIAL | WEAK | NO | SOME | |
Architectural | $20-50M/yr | SAFETY-LEANING | MAINTAIN | MEDIUM-HIGH | TAX | HELPFUL | YES | N/A | PARTIAL | WIDESPREAD | |
Governance | $5-20M/yr | SAFETY-DOMINANT | PRIORITIZE | MEDIUM-HIGH | NEGATIVE | HELPFUL | YES | N/A | PARTIAL | SOME | |
Governance | $5-15M/yr | SAFETY-DOMINANT | INCREASE | MEDIUM | NEUTRAL | HELPFUL | UNKNOWN | PARTIAL | UNLIKELY | SOME | |
Governance | $10-30M/yr | SAFETY-DOMINANT | INCREASE | MEDIUM | TAX | HELPFUL | PARTIAL | WEAK | NO | SOME | |
Governance | $5-15M/yr | SAFETY-DOMINANT | INCREASE | LOW-MEDIUM | TAX | HELPFUL | YES | N/A | PARTIAL | EXPERIMENTAL | |
Governance | $1-5M/yr | SAFETY-DOMINANT | MAINTAIN | HIGH (if implemented) | NEGATIVE | UNCLEAR | UNKNOWN | N/A | YES (if works) | NONE | |
Governance | $10-30M/yr | SAFETY-DOMINANT | PRIORITIZE | MEDIUM-HIGH | TAX | HELPFUL | PARTIAL | N/A | PARTIAL | EXPERIMENTAL | |
Theoretical | $5-20M/yr | SAFETY-DOMINANT | INCREASE | HIGH (if achievable) | TAX | HELPFUL | UNKNOWN | STRONG (if works) | MAYBE | NONE | |
Theoretical | $10-50M/yr | SAFETY-DOMINANT | INCREASE | CRITICAL (if works) | TAX | HELPFUL | UNKNOWN | STRONG (by design) | YES (if works) | NONE | |
Theoretical | $1-5M/yr | SAFETY-DOMINANT | PRIORITIZE | HIGH (if solved) | NEUTRAL | HELPFUL | UNKNOWN | PARTIAL | MAYBE | NONE | |
Theoretical | $5-20M/yr | BALANCED | INCREASE | MEDIUM | SOME | HELPFUL | PARTIAL | N/A | UNKNOWN | EXPERIMENTAL | |
Theoretical | $5-15M/yr | SAFETY-LEANING | PRIORITIZE | HIGH (if solved) | SOME | HELPFUL | UNKNOWN | STRONG (if solved) | MAYBE | NONE | |
Theoretical | $5-20M/yr | SAFETY-DOMINANT | INCREASE | HIGH (if works) | NEGATIVE | HELPFUL | UNKNOWN | WEAK | UNLIKELY | EXPERIMENTAL | |
Theoretical | $10-30M/yr | SAFETY-DOMINANT | PRIORITIZE | HIGH | TAX | HELPFUL | UNKNOWN | PARTIAL | CRITICAL QUESTION | SOME | |
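
The grouped and unified views mentioned in the summary can be approximated offline by parsing the pipe-delimited rows. The sketch below is illustrative only: it uses four rows copied from the table, treats the first cell of each row as a `Category` column, and assumes an ordering of the funding recommendations (`REDUCE < MAINTAIN < INCREASE < PRIORITIZE`) that the page itself does not define. The names `parse_row`, `COLUMNS`, and `FUNDING_RANK` are hypothetical helpers, not part of the page.

```python
# Column names as they appear in the table header; "Category" is assumed
# to be the label for the first cell of each row.
COLUMNS = [
    "Category",
    "Current research investment",
    "Safety vs capability progress ratio",
    "Recommended funding change",
    "How much does this reduce catastrophic risk?",
    "Does it make AI more capable?",
    "Is the world safer with this?",
    "Does it work as AI gets smarter?",
    "Does it work against deceptive AI?",
    "Works for superintelligent AI?",
    "Current adoption level",
]

# Assumed ordering used only for sorting; the page does not specify one.
FUNDING_RANK = {"REDUCE": 0, "MAINTAIN": 1, "INCREASE": 2, "PRIORITIZE": 3}


def parse_row(line):
    """Split one pipe-delimited table row into a dict keyed by column name."""
    cells = [cell.strip() for cell in line.strip().strip("|").split("|")]
    # Drop trailing empty cells left over from the page extraction.
    while cells and cells[-1] == "":
        cells.pop()
    if len(cells) != len(COLUMNS):
        raise ValueError(f"expected {len(COLUMNS)} cells, got {len(cells)}")
    return dict(zip(COLUMNS, cells))


# Four rows copied verbatim from the table, one per category shown.
SAMPLE = """\
Training & Alignment | $1B+/yr | CAPABILITY-DOMINANT | REDUCE | LOW-MEDIUM | DOMINANT | UNCLEAR | BREAKS | NONE | NO | UNIVERSAL | |
Interpretability | $50-150M/yr | SAFETY-DOMINANT | PRIORITIZE | LOW (now) / HIGH (potential) | NEUTRAL | HELPFUL | UNKNOWN | STRONG (if works) | MAYBE | EXPERIMENTAL | |
Governance | $5-20M/yr | SAFETY-DOMINANT | PRIORITIZE | MEDIUM-HIGH | NEGATIVE | HELPFUL | YES | N/A | PARTIAL | SOME | |
Theoretical | $10-30M/yr | SAFETY-DOMINANT | PRIORITIZE | HIGH | TAX | HELPFUL | UNKNOWN | PARTIAL | CRITICAL QUESTION | SOME | |
"""

rows = [parse_row(line) for line in SAMPLE.splitlines() if line.strip()]

# Grouped view: rows bucketed by category.
by_category = {}
for row in rows:
    by_category.setdefault(row["Category"], []).append(row)

# Unified view: one list, sorted by recommended funding change (highest first).
ranked = sorted(
    rows,
    key=lambda r: FUNDING_RANK[r["Recommended funding change"]],
    reverse=True,
)

for row in ranked:
    print(row["Category"], "|", row["Recommended funding change"])
```

Because Python's sort is stable, rows with the same recommendation keep their original table order in the unified view.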