Longterm Wiki

MIRI announces new "Death With Dignity" strategy

web

Author

Eliezer Yudkowsky

Credibility Rating

Good (3/5)

Good quality. Reputable source with community review or editorial standards, but less rigorous than peer-reviewed venues.

Rating inherited from publication venue: LessWrong

A widely-read and controversial 2022 post from Eliezer Yudkowsky representing a notably pessimistic shift in MIRI's public stance, often cited in discussions of AI doom timelines and the psychological/strategic framing of AI safety work.

Forum Post Details

Karma
381
Comments
547
Forum
lesswrong
Forum Tags
Machine Intelligence Research Institute (MIRI) · Information Hazards · AI Risk

Metadata

Importance: 72/100 · blog post · commentary

Summary

Eliezer Yudkowsky argues in April 2022 that humanity is extremely unlikely to solve AI alignment before advanced AI causes an existential catastrophe. Rather than abandoning work entirely, he proposes reframing AI safety efforts as helping humanity 'die with dignity'—doing work that at least creates a historical record of genuine effort, even if survival is deemed nearly impossible.

Key Points

  • Yudkowsky expresses deep pessimism that alignment will be solved in time, placing survival probability near 0%.
  • He proposes 'death with dignity' as an emotional reframe: continuing safety work to improve humanity's historical record rather than to guarantee survival.
  • He critiques existing approaches, including Paul Christiano's schemes and Chris Olah's interpretability work, as insufficient given real-world timelines and constraints.
  • He argues that transparency and interpretability work still has value if it enables even a failed attempt to warn decision-makers of existential danger.
  • The post reflects MIRI's institutional shift toward pessimism about transformative AI safety outcomes and away from optimistic alignment research roadmaps.

Cited by 1 page

Page | Type | Quality
AI Value Lock-in | Risk | 64.0

Cached Content Preview

HTTP 200 · Fetched Mar 15, 2026 · 98 KB

 MIRI announces new "Death With Dignity" strategy 

 by Eliezer Yudkowsky 2nd Apr 2022 21 min read 547 381

 tl;dr:  It's obvious at this point that humanity isn't going to solve the alignment problem, or even try very hard, or even go out with much of a fight.  Since survival is unattainable, we should shift the focus of our efforts to helping humanity die with slightly more dignity.

 Well, let's be frank here.  MIRI didn't solve AGI alignment and at least knows that it didn't.  Paul Christiano's incredibly complicated schemes have no chance of working in real life before DeepMind destroys the world.  Chris Olah's transparency work, at current rates of progress, will at best let somebody at DeepMind give a highly speculative warning about how the current set of enormous inscrutable tensors, inside a system that was recompiled three weeks ago and has now been training by gradient descent for 20 days, might possibly be planning to start trying to deceive its operators.

 Management will then ask what they're supposed to do about that.

 Whoever detected the warning sign will say that there isn't anything known they can do about that.  Just because you can see the system might be planning to kill you, doesn't mean that there's any known way to build a system that won't do that.  Management will then decide not to shut down the project - because it's not certain that the intention was really there or that the AGI will really follow through, because other AGI projects are hard on their heels, because if all those gloomy prophecies are true then there's nothing anybody can do about it anyways.  Pretty soon that troublesome error signal will vanish.

 When Earth's prospects are that far underwater in the basement of the logistic success curve, it may be hard to feel motivated about continuing to fight, since doubling our chances of survival will only take them from 0% to 0%.

 That's why I would suggest reframing the problem - especially on an emotional level - to helping humanity die with dignity, or rather, since even this goal is realistically unattainable at this point, die with slightly more dignity than would otherwise be counterfactually obtained.

 Consider the world if Chris Olah had never existed.  It's then much more likely that nobody will even try and fail to adapt Olah's methodologies to try and read complicated facts about internal intentions and future plans, out of whatever enormous inscrutable tensors are being integrated a million times per second, inside of whatever recently designed system finished training 48 hours ago, in a vast GPU far

... (truncated, 98 KB total)
Resource ID: 79b5b7f6113c8a6c | Stable ID: N2YwNjFmNG