Longterm Wiki

Credibility Rating

3/5 (Good)

Good quality. Reputable source with community review or editorial standards, but less rigorous than peer-reviewed venues.

Rating inherited from publication venue: Alignment Forum

A key MIRI-affiliated research sequence; essential reading for understanding agent-foundations approaches to alignment, particularly the theoretical gaps in classical agent models that motivate much of MIRI's and Redwood's technical research agendas.

Metadata

Importance: 82/100 · blog post · primary source

Summary

A foundational sequence by Scott Garrabrant and Abram Demski examining the deep theoretical challenges that arise when AI agents are embedded within—rather than external to—the environments they reason about. It addresses decision theory, world-modeling, and alignment under the realistic condition that an agent is itself a physical subsystem of the world it must model and act upon.

Key Points

  • Challenges classical AI agent models that assume a clean separation between the agent and its environment, arguing this separation breaks down for real-world AI systems.
  • Explores decision theory for embedded agents, including issues of self-reference, logical uncertainty, and how agents should reason about their own causal role.
  • Addresses 'embedded world-models': how an agent can form accurate models of a world it is physically part of, including modeling itself.
  • Covers robust delegation and subsystem alignment—how to ensure sub-components of an AI system remain aligned with the broader system's goals.
  • Frames these challenges as prerequisites for solving alignment, arguing that standard frameworks (e.g., AIXI) fail to address them adequately.
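The first key point above concerns the "dualistic" framing that classical agent models rely on. As a hypothetical illustration (not drawn from the sequence itself), the dualistic setup can be sketched as two separate objects exchanging symbols across a clean interface; the names `Agent` and `Environment` and the toy dynamics are invented for this sketch:

```python
# Toy sketch of the classical dualistic agent-environment loop:
# agent and environment are disjoint objects that interact only by
# passing actions and observations across a fixed interface.

class Environment:
    """A trivial world whose state is the running parity of actions."""
    def __init__(self):
        self.state = 0

    def step(self, action: int) -> int:
        self.state = (self.state + action) % 2
        return self.state  # observation returned to the agent


class Agent:
    """A trivial policy: repeat the last observation as the next action."""
    def __init__(self):
        self.last_obs = 0

    def act(self) -> int:
        return self.last_obs

    def observe(self, obs: int) -> None:
        self.last_obs = obs


env, agent = Environment(), Agent()
for _ in range(3):
    obs = env.step(agent.act())
    agent.observe(obs)

# The embedded-agency critique: in reality the agent's own computation is
# part of env.state, so no such clean Agent/Environment split exists.
```

The sequence's point is that this separation, however convenient for theory, is unavailable to a real agent that is physically a subsystem of the world it acts on.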

Cited by 3 pages

Cached Content Preview

HTTP 200 · Fetched Mar 15, 2026 · 2 KB

# Embedded Agency

This is a sequence by Scott Garrabrant and Abram Demski on one current way of thinking about alignment: Embedded Agency.

Full-Text Version

  • [Embedded Agency (full-text version)](https://www.alignmentforum.org/s/Rm6oQRJJmhGCcLvxh/p/i3BTagvt3HbPMx6PN) — [Scott Garrabrant](https://www.alignmentforum.org/users/scott-garrabrant), [abramdemski](https://www.alignmentforum.org/users/abramdemski)

Sequence Posts

  • [Embedded Agents](https://www.alignmentforum.org/s/Rm6oQRJJmhGCcLvxh/p/p7x32SEt43ZMC9r7r) — [abramdemski](https://www.alignmentforum.org/users/abramdemski), [Scott Garrabrant](https://www.alignmentforum.org/users/scott-garrabrant)
  • [Decision Theory](https://www.alignmentforum.org/s/Rm6oQRJJmhGCcLvxh/p/zcPLNNw4wgBX5k8kQ) — [abramdemski](https://www.alignmentforum.org/users/abramdemski), [Scott Garrabrant](https://www.alignmentforum.org/users/scott-garrabrant)
  • [Embedded World-Models](https://www.alignmentforum.org/s/Rm6oQRJJmhGCcLvxh/p/efWfvrWLgJmbBAs3m) — [abramdemski](https://www.alignmentforum.org/users/abramdemski), [Scott Garrabrant](https://www.alignmentforum.org/users/scott-garrabrant)
  • [Robust Delegation](https://www.alignmentforum.org/s/Rm6oQRJJmhGCcLvxh/p/iTpLAaPamcKyjmbFC) — [abramdemski](https://www.alignmentforum.org/users/abramdemski), [Scott Garrabrant](https://www.alignmentforum.org/users/scott-garrabrant)
  • [Subsystem Alignment](https://www.alignmentforum.org/s/Rm6oQRJJmhGCcLvxh/p/ChierESmenTtCQqZy) — [abramdemski](https://www.alignmentforum.org/users/abramdemski), [Scott Garrabrant](https://www.alignmentforum.org/users/scott-garrabrant)
  • [Embedded Curiosities](https://www.alignmentforum.org/s/Rm6oQRJJmhGCcLvxh/p/j9CbmSsnprxB2uFY9) — [Scott Garrabrant](https://www.alignmentforum.org/users/scott-garrabrant), [abramdemski](https://www.alignmentforum.org/users/abramdemski)
