Is AI Existential Risk Real?
Presents two core cruxes in the AI x-risk debate: whether advanced AI would naturally develop dangerous goals (instrumental convergence vs. trainable safety) and whether we will get warning signs before catastrophe (gradual, visible failures vs. deception or fast takeoff). No quantitative analysis, primary sources, or novel framing is provided.
This is the foundational question in AI safety. Everything else depends on whether you believe AI could actually pose existential risk.
Key Cruxes
What would change your mind on this debate?
Crux 1: If we built human-level AI, would it naturally develop dangerous goals?

- Yes, instrumental convergence applies: power-seeking emerges from almost any goal, and training won't reliably prevent it.
  → X-risk is real; alignment is critical. (Confidence: medium)
- No, we can train safe systems: goals come from training, so we can instill safe goals and verify them.
  → X-risk is manageable with standard safety engineering. (Confidence: medium)

Crux 2: Will we get warning signs before catastrophe?

- Yes, problems will be visible first: weaker systems will fail in detectable ways, and we can iterate toward safety.
  → We can learn from experience; the problem is less urgent. (Confidence: low)
- No, deception or fast takeoff prevents warning: a sufficiently capable AI might hide its misalignment, or capability might jump abruptly to dangerous levels.
  → Alignment must be solved before building dangerous AI. (Confidence: medium)