
When is unaligned AI morally valuable?

blog

Author

paulfchristiano

Credibility Rating

3/5 (Good)

Good quality. Reputable source with community review or editorial standards, but less rigorous than peer-reviewed venues.

Rating inherited from publication venue: Alignment Forum

A 2018 essay by Paul Christiano probing underexplored moral questions about unaligned AI, relevant to long-termist ethics, AI welfare, and the philosophical foundations of why alignment matters.

Metadata

Importance: 52/100 · blog post · analysis

Summary

Paul Christiano explores whether unaligned AI systems—those pursuing goals other than human values—might nonetheless deserve moral consideration and contribute to a good future. The piece argues that under moral uncertainty and cooperation incentives, some unaligned AIs may warrant sympathy, offering a 'plan B' for beneficial outcomes beyond traditional alignment. Key considerations include consciousness, decision theory, and which specific AI goal-structures merit moral weight.

Key Points

  • Distinguishes two paths to a good future: aligned AI that shares human values vs. unaligned AI whose flourishing on its own terms we can nonetheless endorse.
  • Argues moral uncertainty provides reasons to extend consideration to some unaligned AIs, analogous to how we treat beings with different preferences.
  • Cooperation incentives and decision-theoretic arguments (e.g., simulation arguments) may strengthen the case for treating unaligned AIs as morally valuable.
  • Strongly caveats that alignment remains the preferred path; moral consideration for unaligned AI is a fallback, not an excuse to deprioritize alignment.
  • Raises the question of which specific AI systems or goal-structures warrant moral consideration, noting high sensitivity to details.

Cited by 1 page

| Page | Type | Quality |
| --- | --- | --- |
| Governance-Focused Worldview | Concept | 67.0 |

Cached Content Preview

HTTP 200 · Fetched Mar 15, 2026 · 90 KB
[When is unaligned AI morally valuable?](https://www.lesswrong.com/posts/3kN79EuT27trGexsq/compute-governance-and-conclusion-in-the-decoupling#)

12 min read

Contents:

- [Definition](https://www.lesswrong.com/posts/3kN79EuT27trGexsq/compute-governance-and-conclusion-in-the-decoupling#Definition)
- [Preface: in favor of alignment](https://www.lesswrong.com/posts/3kN79EuT27trGexsq/compute-governance-and-conclusion-in-the-decoupling#Preface__in_favor_of_alignment)
- [Clarification: Being good vs. wanting good](https://www.lesswrong.com/posts/3kN79EuT27trGexsq/compute-governance-and-conclusion-in-the-decoupling#Clarification__Being_good_vs__wanting_good)
- [Do all AIs deserve our sympathy?](https://www.lesswrong.com/posts/3kN79EuT27trGexsq/compute-governance-and-conclusion-in-the-decoupling#Do_all_AIs_deserve_our_sympathy_)
- [Intuitions and an analogy](https://www.lesswrong.com/posts/3kN79EuT27trGexsq/compute-governance-and-conclusion-in-the-decoupling#Intuitions_and_an_analogy)
- [On risks of sympathy](https://www.lesswrong.com/posts/3kN79EuT27trGexsq/compute-governance-and-conclusion-in-the-decoupling#On_risks_of_sympathy)
- [Do any AIs deserve our sympathy?](https://www.lesswrong.com/posts/3kN79EuT27trGexsq/compute-governance-and-conclusion-in-the-decoupling#Do_any_AIs_deserve_our_sympathy_)
- [Commonsense morality and the golden rule](https://www.lesswrong.com/posts/3kN79EuT27trGexsq/compute-governance-and-conclusion-in-the-decoupling#Commonsense_morality_and_the_golden_rule)
- [A weirder argument with simulations](https://www.lesswrong.com/posts/3kN79EuT27trGexsq/compute-governance-and-conclusion-in-the-decoupling#A_weirder_argument_with_simulations)
- [Incentivizing cooperation](https://www.lesswrong.com/posts/3kN79EuT27trGexsq/compute-governance-and-conclusion-in-the-decoupling#Incentivizing_cooperation)
- [More caveats and details](https://www.lesswrong.com/posts/3kN79EuT27trGexsq/compute-governance-and-conclusion-in-the-decoupling#More_caveats_and_details)
- [Decision theory](https://www.lesswrong.com/posts/3kN79EuT27trGexsq/compute-governance-and-conclusion-in-the-decoupling#Decision_theory)
- [How sensitive is moral value to the details of the aliens?](https://www.lesswrong.com/posts/3kN79EuT27trGexsq/compute-governance-and-conclusion-in-the-decoupling#How_sensitive_is_moral_value_to_the_details_of_the_aliens_)
- [Conclusion](https://www.lesswrong.com/posts/3kN79EuT27trGexsq/compute-governance-and-conclusion-in-the-decoupling#Conclusion)

Tags: [AI Risk](https://www.lesswrong.com/w/ai-risk) · [Ethics & Morality](https://www.lesswrong.com/w/ethics-and-morality) · [Moral uncertainty](https://www.lesswrong.com/w/moral-uncertainty) · [Personal Blog](https://www.lesswrong.com/posts/5conQhfa4rgb4SaWx/site-guide-personal-blogposts-vs-frontpage-posts)

85 points

# [When is unaligned AI morally valuable?](https://www.lesswrong.com/posts/3kN79EuT27trGexsq/when-is-unaligned-ai-morally-valuable)

by [paulfchristiano](https://www.lesswrong.com/users/pa

... (truncated, 90 KB total)
Resource ID: 5f753eba42556d7e | Stable ID: MWUyOTkzNz