
When is unaligned AI morally valuable?

blog

Author

paulfchristiano

Credibility Rating

3/5 (Good)

Good quality. Reputable source with community review or editorial standards, but less rigorous than peer-reviewed venues.

Rating inherited from publication venue: Alignment Forum

A 2018 essay by Paul Christiano probing underexplored moral questions about unaligned AI, relevant to long-termist ethics, AI welfare, and the philosophical foundations of why alignment matters.

Metadata

Importance: 52/100 · blog post · analysis

Summary

Paul Christiano explores whether unaligned AI systems—those pursuing goals other than human values—might nonetheless deserve moral consideration and contribute to a good future. The piece argues that under moral uncertainty and cooperation incentives, some unaligned AIs may warrant sympathy, offering a 'plan B' for beneficial outcomes beyond traditional alignment. Key considerations include consciousness, decision theory, and which specific AI goal-structures merit moral weight.

Key Points

  • Distinguishes two paths to a good future: aligned AI that shares human values vs. unaligned AI whose flourishing on its own terms we can nonetheless endorse.
  • Argues moral uncertainty provides reasons to extend consideration to some unaligned AIs, analogous to how we treat beings with different preferences.
  • Cooperation incentives and decision-theoretic arguments (e.g., simulation arguments) may strengthen the case for treating unaligned AIs as morally valuable.
  • Strongly caveats that alignment remains the preferred path; moral consideration for unaligned AI is a fallback, not an excuse to deprioritize alignment.
  • Raises the question of which specific AI systems or goal-structures warrant moral consideration, noting high sensitivity to details.

Cited by 1 page

| Page | Type | Quality |
| --- | --- | --- |
| Governance-Focused Worldview | Concept | 67.0 |

Cached Content Preview

HTTP 200 · Fetched Mar 15, 2026 · 90 KB
[When is unaligned AI morally valuable?](https://www.lesswrong.com/posts/3kN79EuT27trGexsq/compute-governance-and-conclusion-in-the-decoupling#)

12 min read

Contents:

- [Definition](https://www.lesswrong.com/posts/3kN79EuT27trGexsq/compute-governance-and-conclusion-in-the-decoupling#Definition)
- [Preface: in favor of alignment](https://www.lesswrong.com/posts/3kN79EuT27trGexsq/compute-governance-and-conclusion-in-the-decoupling#Preface__in_favor_of_alignment)
- [Clarification: Being good vs. wanting good](https://www.lesswrong.com/posts/3kN79EuT27trGexsq/compute-governance-and-conclusion-in-the-decoupling#Clarification__Being_good_vs__wanting_good)
- [Do all AIs deserve our sympathy?](https://www.lesswrong.com/posts/3kN79EuT27trGexsq/compute-governance-and-conclusion-in-the-decoupling#Do_all_AIs_deserve_our_sympathy_)
- [Intuitions and an analogy](https://www.lesswrong.com/posts/3kN79EuT27trGexsq/compute-governance-and-conclusion-in-the-decoupling#Intuitions_and_an_analogy)
- [On risks of sympathy](https://www.lesswrong.com/posts/3kN79EuT27trGexsq/compute-governance-and-conclusion-in-the-decoupling#On_risks_of_sympathy)
- [Do any AIs deserve our sympathy?](https://www.lesswrong.com/posts/3kN79EuT27trGexsq/compute-governance-and-conclusion-in-the-decoupling#Do_any_AIs_deserve_our_sympathy_)
- [Commonsense morality and the golden rule](https://www.lesswrong.com/posts/3kN79EuT27trGexsq/compute-governance-and-conclusion-in-the-decoupling#Commonsense_morality_and_the_golden_rule)
- [A weirder argument with simulations](https://www.lesswrong.com/posts/3kN79EuT27trGexsq/compute-governance-and-conclusion-in-the-decoupling#A_weirder_argument_with_simulations)
- [Incentivizing cooperation](https://www.lesswrong.com/posts/3kN79EuT27trGexsq/compute-governance-and-conclusion-in-the-decoupling#Incentivizing_cooperation)
- [More caveats and details](https://www.lesswrong.com/posts/3kN79EuT27trGexsq/compute-governance-and-conclusion-in-the-decoupling#More_caveats_and_details)
- [Decision theory](https://www.lesswrong.com/posts/3kN79EuT27trGexsq/compute-governance-and-conclusion-in-the-decoupling#Decision_theory)
- [How sensitive is moral value to the details of the aliens?](https://www.lesswrong.com/posts/3kN79EuT27trGexsq/compute-governance-and-conclusion-in-the-decoupling#How_sensitive_is_moral_value_to_the_details_of_the_aliens_)
- [Conclusion](https://www.lesswrong.com/posts/3kN79EuT27trGexsq/compute-governance-and-conclusion-in-the-decoupling#Conclusion)

Tags: [AI Risk](https://www.lesswrong.com/w/ai-risk) · [Ethics & Morality](https://www.lesswrong.com/w/ethics-and-morality) · [Moral uncertainty](https://www.lesswrong.com/w/moral-uncertainty) · [Personal Blog](https://www.lesswrong.com/posts/5conQhfa4rgb4SaWx/site-guide-personal-blogposts-vs-frontpage-posts)

85 points

# [When is unaligned AI morally valuable?](https://www.lesswrong.com/posts/3kN79EuT27trGexsq/when-is-unaligned-ai-morally-valuable)

by [paulfchristiano](https://www.lesswrong.com/users/pa

... (truncated, 90 KB total)
Resource ID: 5f753eba42556d7e | Stable ID: MWUyOTkzNz