Forethought Foundation's analysis
forethought.org/research/agi-and-lock-in
Published by the Forethought Foundation, this resource addresses lock-in risks from AGI, a concern prominent in longtermist AI safety discourse. It is relevant to researchers working on societal-scale impacts and governance structures for transformative AI.
Metadata
Importance: 52/100 · organizational report · analysis
Summary
This Forethought Foundation research examines the risks of lock-in scenarios associated with the development of AGI, exploring how advanced AI systems could entrench particular power structures, values, or trajectories in ways that are difficult or impossible to reverse. The analysis likely covers mechanisms by which AGI development could foreclose future options and reduce humanity's long-run autonomy.
Key Points
- Examines how AGI development could lead to irreversible lock-in of specific values, power structures, or governance arrangements
- Explores the relationship between advanced AI capabilities and the concentration of control or influence
- Considers long-term civilizational risks beyond near-term safety concerns, focusing on path dependency
- Analyzes scenarios where AGI transitions reduce humanity's ability to course-correct or maintain meaningful agency
- Likely proposes frameworks or interventions to preserve optionality and prevent premature lock-in
Cited by 1 page
| Page | Type | Quality |
|---|---|---|
| AI Structural Risk Cruxes | Crux | 66.0 |
Cached Content Preview
HTTP 200 · Fetched Mar 15, 2026 · 98 KB
AGI and Lock-in
Authors: Lukas Finnveden, Jess Riedel, Carl Shulman
Published: 8th October 2022 · Last update: 11th March 2025
Contents
Abstract
0 Summary
0.0 The claim
0.1 Preserving information
0.2 Executing intentions
0.3 Preventing disruption
0.4 Some things we don’t argue for
0.5 Structure of the document
1 To be more precise
1.1 What do we assume about “AGI”?
1.2 What do we mean by “lock-in”?
1.3 How confident are we that stability is feasible, for how long?
2 Desirability and probability
2.1 Would it be good for highly stable institutions to be built?
2.2 How likely is this?
3 Past sources of instability
3.1 Foreign intervention
3.2 Aging and death
3.3 Technological or societal changes favoring new values
3.4 Internal rebellion
4 Preserving information, and baseline stability
4.1 Preserving complex goals
4.2 Digital error correction
4.3 Baseline stability via redundancy
5 Aligning with goals
5.1 What do we mean by “goals”
5.2 Alignment
6 Stability of goals
6.1 Why do humans’ goals drift?
6.2 Why might AGI goals drift?
6.3 Institutional drift
6.4 Interpreting goals
6.5 Aligning with that interpretation
6.6 Verifying loyalty
7 Robustness to natural disasters
7.1 Natural disasters
7.2 Astronomical stability
8 Robustness to non-aligned actors
8.1 In-principle feasibility
8.2 How much control is needed?
8.3 Alien civilisations
References

Abstract
The long-term future of intelligent life is currently unpredictable and undetermined. We argue that the invention of artificial general intelligence (AGI) could change this by making extreme types of lock-in technologically feasible. In particular, we argue that AGI would make it technologically feasible to (i) perfectly preserve nuanced specifications of a wide variety of values or goals far into the future, and (ii) develop AGI-based institutions that would (with high probability) competently pursue any such values for at least millions, and plausibly trillions, of years.
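Claim (i) turns on digital error correction (see section 4.2 in the contents above): redundantly stored digital information can be kept intact indefinitely by periodically re-reading the copies and majority-voting away accumulated errors. Below is a minimal sketch of that idea, assuming a simple repetition code; the example is illustrative only and is not drawn from the paper.

```python
import random

def encode(bits, copies=7):
    # Repetition code: keep several redundant copies of each bit.
    return [[b] * copies for b in bits]

def corrupt(stored, flip_prob=0.01):
    # Model storage decay: each copy flips independently with flip_prob.
    return [[b ^ (random.random() < flip_prob) for b in group] for group in stored]

def refresh(stored):
    # Majority-vote each bit, then rewrite all copies from the corrected value.
    corrected = [int(sum(group) * 2 > len(group)) for group in stored]
    return encode(corrected, copies=len(stored[0]))

# A "value specification" reduced to bits, stored and refreshed over many cycles.
message = [1, 0, 1, 1, 0, 0, 1, 0]
stored = encode(message)
for _ in range(1000):  # 1000 decay-and-refresh cycles
    stored = refresh(corrupt(stored))
recovered = [int(sum(g) * 2 > len(g)) for g in stored]
print(recovered == message)  # True with very high probability
```

With k copies each flipping independently with probability p per cycle, the chance that a majority flips before the next refresh falls off exponentially in k, so the error rate over any fixed horizon can be driven arbitrarily low by adding redundancy and refreshing often enough. This is presumably the kind of argument the paper's section on digital error correction develops in full.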
0 Summary
0.0 The claim
Life on Earth could survive for millions of years. Life in space could plausibly survive for trillions of years. What will happen to intelligent life during this time? Some possible claims are:
A) Humanity will almost certainly go extinct in the next million years.
B) Under Darwinian pressures, intelligent life will spread throughout the stars and rapidly evolve toward maximal reproductive fitness.
C) Through moral reflection, intelligent life will reliably be driven to pursue some specific “higher” (non-reproductive) goal, such as maximizing the happiness of all creatures.
D) The choices of intelligent life are deeply, fundamentally uncertain. It will at no point be predictable what intelligent beings will choose to do in the following 1000 years.
E) It is possible to stabilize many features of society for millions or trillions of years. But it is possible to stabilize them into many different shapes — so civilization’s long-ter
... (truncated, 98 KB total)
Resource ID: 0115b3047845750f | Stable ID: NzA5MjJhYz