Longterm Wiki

The Superintelligent Will: Motivation and Instrumental Rationality in Advanced Artificial Agents (Bostrom, 2012)


This 2012 paper by Nick Bostrom is a cornerstone of AI safety theory. It formalizes the Orthogonality and Instrumental Convergence theses, which are widely cited across alignment research and provide a conceptual foundation for Bostrom's 2014 book 'Superintelligence'.

Metadata

Importance: 92/100 · conference paper · primary source

Summary

Bostrom's paper introduces two foundational theses in AI safety: the Orthogonality Thesis (intelligence and final goals are independent dimensions) and the Instrumental Convergence Thesis (sufficiently intelligent agents will tend toward common sub-goals, such as self-preservation and resource acquisition, regardless of their final goals). These concepts underpin much of contemporary AI alignment theory.

Key Points

  • Orthogonality Thesis: any level of intelligence can, in principle, be combined with any final goal, countering the assumption that a smarter AI must have human-friendly values
  • Instrumental Convergence Thesis: advanced agents with widely varying final goals will converge on similar instrumental sub-goals, such as self-preservation, goal-content integrity, and resource acquisition
  • Challenges the intuition that a superintelligent AI would naturally become benevolent or adopt human values through reasoning alone
  • Provides a theoretical basis for why a misaligned superintelligent AI could be dangerous regardless of its specific programmed objectives
  • Offers a foundational framework for why alignment cannot be assumed and must be explicitly engineered
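The convergence argument in the points above can be illustrated with a toy decision problem (this sketch is not from the paper; the payoff numbers and function names are invented for illustration): if achieving any final goal scales with available resources, then agents holding very different goals all prefer to acquire resources first.

```python
# Toy illustration of instrumental convergence (hypothetical model, not
# Bostrom's formalism): achievement of any final goal is assumed to scale
# with the resources the agent controls.

def expected_achievement(goal_value: float, resources: float) -> float:
    """How well the agent's final goal is served, given its resources."""
    return goal_value * resources

def best_first_action(goal_value: float) -> str:
    base_resources = 1.0
    # Option A: pursue the final goal directly with current resources.
    direct = expected_achievement(goal_value, base_resources)
    # Option B: spend the first step acquiring resources (assumed to
    # double them), paying a small time cost, then pursue the goal.
    time_cost = 0.9
    acquire_first = expected_achievement(goal_value, 2.0 * base_resources) * time_cost
    return "acquire resources" if acquire_first > direct else "act directly"

# Agents with very different final goals make the same instrumental choice:
for name, value in {"paperclips": 1.0, "theorems": 5.0, "art": 0.3}.items():
    print(name, "->", best_first_action(value))
```

Under these assumed payoffs, every agent picks "acquire resources" regardless of what it ultimately values, which is the structural point of the thesis: the sub-goal is useful for almost any final goal, so diverse agents converge on it.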

Cited by 1 page

| Page | Type | Quality |
| --- | --- | --- |
| Instrumental Convergence Framework | Analysis | 60.0 |

Cached Content Preview

HTTP 200 · Fetched Mar 15, 2026 · 0 KB
# Page Not Found

The page could not be found. Please check the URL or link that sent you here.

[Go back to the home page](https://nickbostrom.com/)
Resource ID: 07ea295d40f85602 | Stable ID: NjEwNWFkMT