The Superintelligent Will: Motivation and Instrumental Rationality in Advanced Artificial Agents (Bostrom, 2012)
nickbostrom.com · nickbostrom.com/superintelligent-will.pdf
This 2012/2014 paper by Nick Bostrom is a cornerstone of AI safety theory. It formalizes the Orthogonality Thesis and the Instrumental Convergence Thesis, which are widely cited across alignment research and form a conceptual foundation for Bostrom's book 'Superintelligence'.
Metadata
Importance: 92/100 · conference paper · primary source
Summary
Bostrom's paper introduces two foundational theses in AI safety: the Orthogonality Thesis (intelligence and goals are independent dimensions) and the Instrumental Convergence Thesis (sufficiently intelligent agents will tend toward common sub-goals like self-preservation and resource acquisition regardless of final goals). These concepts underpin much of contemporary AI alignment theory.
Key Points
- Orthogonality Thesis: any level of intelligence can in principle be combined with any final goal, refuting the assumption that smarter AI must have human-friendly values
- Instrumental Convergence Thesis: advanced agents with diverse final goals will converge on similar instrumental sub-goals (self-preservation, goal-content integrity, resource acquisition)
- Challenges the intuition that superintelligent AI would naturally become benevolent or adopt human values through reasoning alone
- Provides a theoretical basis for why misaligned superintelligent AI could be dangerous regardless of its specific programmed objectives
- Foundational framework for understanding why alignment cannot be assumed and must be explicitly engineered
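The convergence claim above can be made concrete with a toy decision model. This is a sketch, not anything from the paper itself: the utility numbers, success probabilities, and action names are all illustrative assumptions. It shows two expected-utility maximizers with very different final goals making the same instrumental choice when resource-poor.

```python
# Toy sketch of instrumental convergence (illustrative model, not Bostrom's):
# agents rank "acquire resources" highly for ANY final goal, because resources
# raise the probability of achieving that goal later.

def expected_value(action: str, final_goal_value: float, resources: int) -> float:
    """Crude assumed model: success probability grows with resources held."""
    if action == "acquire_resources":
        # Spend this step gathering resources, then attempt the goal.
        future_p = min(1.0, 0.1 * (resources + 1))
        return future_p * final_goal_value
    if action == "pursue_goal_now":
        # Attempt the goal immediately with current resources.
        p_now = min(1.0, 0.1 * resources)
        return p_now * final_goal_value
    return 0.0

def best_action(final_goal_value: float, resources: int) -> str:
    actions = ["acquire_resources", "pursue_goal_now"]
    return max(actions, key=lambda a: expected_value(a, final_goal_value, resources))

# Agents with unrelated final goals (low-stakes vs. high-stakes) converge on
# the same instrumental sub-goal when starting with no resources:
for goal_value in (10.0, 1000.0):
    print(best_action(goal_value, resources=0))  # prints "acquire_resources" both times
```

The point of the sketch is that the convergence falls out of the structure of the decision problem, not out of the final goal: any positive `final_goal_value` yields the same first choice.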
Cited by 1 page
| Page | Type | Quality |
|---|---|---|
| Instrumental Convergence Framework | Analysis | 60.0 |
Cached Content Preview
HTTP 200 · Fetched Mar 15, 2026 · 0 KB
> # Page Not Found
>
> The page could not be found. Please check the URL or link that sent you here. [Go back to the home page](https://nickbostrom.com/)
Resource ID: 07ea295d40f85602 | Stable ID: NjEwNWFkMT