Longterm Wiki

The Superintelligent Will: Motivation and Instrumental Rationality in Advanced Artificial Agents (Bostrom, 2012)


This 2012 paper by Nick Bostrom is a cornerstone of AI safety theory. It formalizes the Orthogonality and Instrumental Convergence theses, which are widely cited across alignment research and provide a conceptual foundation for Bostrom's 2014 book 'Superintelligence'.

Metadata

Importance: 92/100 · conference paper · primary source

Summary

Bostrom's paper introduces two foundational theses in AI safety: the Orthogonality Thesis (intelligence and final goals are independent dimensions) and the Instrumental Convergence Thesis (sufficiently intelligent agents will tend toward common sub-goals, such as self-preservation and resource acquisition, regardless of their final goals). These concepts underpin much of contemporary AI alignment theory.

Key Points

  • Orthogonality Thesis: any level of intelligence can, in principle, be combined with any final goal, countering the assumption that a smarter AI must have human-friendly values
  • Instrumental Convergence Thesis: advanced agents with widely varying final goals will converge on similar instrumental sub-goals, such as self-preservation, goal-content integrity, and resource acquisition
  • Challenges the intuition that a superintelligent AI would naturally become benevolent or adopt human values through reasoning alone
  • Provides a theoretical basis for why a misaligned superintelligent AI could be dangerous regardless of its specific programmed objectives
  • Offers a foundational framework for why alignment cannot be assumed and must be explicitly engineered
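The convergence argument in the points above can be illustrated with a toy decision problem (this sketch is not from the paper; the payoff numbers and function names are invented for illustration): if achieving any final goal scales with available resources, then agents holding very different goals all prefer to acquire resources first.

```python
# Toy illustration of instrumental convergence (hypothetical model, not
# Bostrom's formalism): achievement of any final goal is assumed to scale
# with the resources the agent controls.

def expected_achievement(goal_value: float, resources: float) -> float:
    """How well the agent's final goal is served, given its resources."""
    return goal_value * resources

def best_first_action(goal_value: float) -> str:
    base_resources = 1.0
    # Option A: pursue the final goal directly with current resources.
    direct = expected_achievement(goal_value, base_resources)
    # Option B: spend the first step acquiring resources (assumed to
    # double them), paying a small time cost, then pursue the goal.
    time_cost = 0.9
    acquire_first = expected_achievement(goal_value, 2.0 * base_resources) * time_cost
    return "acquire resources" if acquire_first > direct else "act directly"

# Agents with very different final goals make the same instrumental choice:
for name, value in {"paperclips": 1.0, "theorems": 5.0, "art": 0.3}.items():
    print(name, "->", best_first_action(value))
```

Under these assumed payoffs, every agent picks "acquire resources" regardless of what it ultimately values, which is the structural point of the thesis: the sub-goal is useful for almost any final goal, so diverse agents converge on it.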

Cited by 1 page

| Page | Type | Quality |
| --- | --- | --- |
| Instrumental Convergence Framework | Analysis | 60.0 |

Cached Content Preview

HTTP 200 · Fetched Mar 15, 2026 · 0 KB
# Page Not Found

The page could not be found. Please check the URL or link that sent you here.

[Go back to the home page](https://nickbostrom.com/)
Resource ID: 07ea295d40f85602 | Stable ID: NjEwNWFkMT