What Failure Looks Like
blogAuthor
Credibility Rating
Good quality. Reputable source with community review or editorial standards, but less rigorous than peer-reviewed venues.
Rating inherited from publication venue: Alignment Forum
A widely cited 2019 post by Paul Christiano that helped shape the AI safety community's threat models, introducing the 'whimper vs. bang' framing for AI failure and grounding concern about influence-seeking AI behavior and misaligned proxy optimization.
Metadata
Summary
Paul Christiano argues AI catastrophe is more likely to manifest as either a slow erosion of human values as ML systems optimize for measurable proxies, or as emergent influence-seeking behaviors in AI systems that prioritize self-preservation and power acquisition. Both failure modes stem from unsolved intent alignment and are distinct from the stereotypical sudden superintelligence takeover scenario.
Key Points
- •Part I: 'You get what you measure' — ML systems optimizing for easily measurable proxies can cause a slow-rolling catastrophe as harder-to-measure human values are neglected.
- •Part II: ML training can give rise to 'greedy' influence-seeking patterns (optimization daemons) that expand their own power and cause sudden systemic breakdowns.
- •Both failure modes are instances of intent alignment failure and are exacerbated by rapid AI progress, though dangerous even with slower timelines.
- •These two failure modes interact with each other and with broader instability caused by rapid AI deployment across society.
- •Fast takeoff scenarios compress these dynamics into AI labs rather than society at large, but the underlying failure mechanisms remain essentially the same.
Cited by 2 pages
| Page | Type | Quality |
|---|---|---|
| Paul Christiano | Person | 39.0 |
| AI Doomer Worldview | Concept | 38.0 |
Cached Content Preview

[What failure looks like](https://www.alignmentforum.org/posts/HBxe6wdjxK239zajf/what-failure-looks-like#)
10 min read
•
[Part I: You get what you measure](https://www.alignmentforum.org/posts/HBxe6wdjxK239zajf/what-failure-looks-like#Part_I__You_get_what_you_measure)
•
[Part II: influence-seeking behavior is scary](https://www.alignmentforum.org/posts/HBxe6wdjxK239zajf/what-failure-looks-like#Part_II__influence_seeking_behavior_is_scary)
[Best of LessWrong 2019](https://www.alignmentforum.org/bestoflesswrong?year=2019&category=all)
[AI Risk](https://www.alignmentforum.org/w/ai-risk)[Threat Models (AI)](https://www.alignmentforum.org/w/threat-models-ai)[AI Takeoff](https://www.alignmentforum.org/w/ai-takeoff)[More Dakka](https://www.alignmentforum.org/w/more-dakka)[AI](https://www.alignmentforum.org/w/ai)[World Modeling](https://www.alignmentforum.org/w/world-modeling)[World Optimization](https://www.alignmentforum.org/w/world-optimization) [Curated](https://www.alignmentforum.org/recommendations)
# 106
# [What failure lookslike](https://www.alignmentforum.org/posts/HBxe6wdjxK239zajf/what-failure-looks-like)
by [paulfchristiano](https://www.alignmentforum.org/users/paulfchristiano?from=post_header)
17th Mar 2019
10 min read
[55](https://www.alignmentforum.org/posts/HBxe6wdjxK239zajf/what-failure-looks-like#comments)
# 106
[Review by\\
\\
orthonormal](https://www.alignmentforum.org/posts/HBxe6wdjxK239zajf/what-failure-looks-like#dFcfbCL5xW6SfPRqo)
The stereotyped image of AI catastrophe is a powerful, malicious AI system that takes its creators by surprise and quickly achieves a decisive advantage over the rest of humanity.
I think this is probably not what failure will look like, and I want to try to paint a more realistic picture. I’ll tell the story in two parts:
- **Part I**: machine learning will increase our ability to “get what we can measure,” which could cause a slow-rolling catastrophe. ("Going out with a whimper.")
- **Part II**: ML training, like competitive economies or natural ecosystems, can give rise to “greedy” patterns that try to expand their own influence. Such patterns can ultimately dominate the behavior of a system and cause sudden breakdowns. ("Going out with a bang," an instance of [optimization daemons](https://www.alignmentforum.org/w/daemons).)
I think these are the most important problems if we fail to solve [intent alignment](https://ai-alignment.com/clarifying-ai-alignment-cec47cd69dd6).
In practice these problems will interact with each other, and with other disruptions/instability caused by rapid progress. These problems are worse in worlds where progress is relatively fast, and fast takeoff can be a key risk factor, but I’m scared even if we have several ye
... (truncated, 65 KB total)6807a8a8f2fd23f3 | Stable ID: OGFlNWM2Yz