Buck Shlegeris on controlling AI that wants to take over - 80000 Hours

web

80,000 Hours · 80000hours.org/podcast/episodes/buck-shlegeris-ai-control...

Credibility Rating

3/5

Good (3)

Good quality. Reputable source with community review or editorial standards, but less rigorous than peer-reviewed venues.

Rating inherited from publication venue: 80,000 Hours

Metadata

Cited by 1 page

| Page | Type | Quality |
| --- | --- | --- |
| Sleeper Agent Detection | Approach | 66.0 |

Cached Content Preview

HTTP 200 · Fetched May 10, 2026 · 98 KB

## On this page:

- [Introduction](https://80000hours.org/podcast/episodes/buck-shlegeris-ai-control-scheming/#top)
- [1 Highlights](https://80000hours.org/podcast/episodes/buck-shlegeris-ai-control-scheming/#highlights)
- [2 Articles, books, and other media discussed in the show](https://80000hours.org/podcast/episodes/buck-shlegeris-ai-control-scheming/#articles-books-and-other-media-discussed-in-the-show)
- [3 Transcript](https://80000hours.org/podcast/episodes/buck-shlegeris-ai-control-scheming/#transcript)
  - [3.1 Cold open \[00:00:00\]](https://80000hours.org/podcast/episodes/buck-shlegeris-ai-control-scheming/#cold-open-000000)
  - [3.2 Who's Buck Shlegeris? \[00:01:27\]](https://80000hours.org/podcast/episodes/buck-shlegeris-ai-control-scheming/#whos-buck-shlegeris-000127)
  - [3.3 What's AI control? \[00:01:51\]](https://80000hours.org/podcast/episodes/buck-shlegeris-ai-control-scheming/#whats-ai-control-000151)
  - [3.4 Why is AI control hot now? \[00:05:39\]](https://80000hours.org/podcast/episodes/buck-shlegeris-ai-control-scheming/#why-is-ai-control-hot-now-000539)
  - [3.5 Detecting human vs AI spies \[00:10:32\]](https://80000hours.org/podcast/episodes/buck-shlegeris-ai-control-scheming/#detecting-human-vs-ai-spies-001032)
  - [3.6 Acute vs chronic AI betrayal \[00:15:21\]](https://80000hours.org/podcast/episodes/buck-shlegeris-ai-control-scheming/#acute-vs-chronic-ai-betrayal-001521)
  - [3.7 How to catch AIs trying to escape \[00:17:48\]](https://80000hours.org/podcast/episodes/buck-shlegeris-ai-control-scheming/#how-to-catch-ais-trying-to-escape-001748)
  - [3.8 The cheapest AI control techniques \[00:32:48\]](https://80000hours.org/podcast/episodes/buck-shlegeris-ai-control-scheming/#the-cheapest-ai-control-techniques-003248)
  - [3.9 Can we get untrusted models to do trusted work? \[00:38:58\]](https://80000hours.org/podcast/episodes/buck-shlegeris-ai-control-scheming/#can-we-get-untrusted-models-to-do-trusted-work-003858)
  - [3.10 If we catch a model escaping... will we do anything? \[00:50:15\]](https://80000hours.org/podcast/episodes/buck-shlegeris-ai-control-scheming/#if-we-catch-a-model-escaping-will-we-do-anything-005015)
  - [3.11 Getting AI models to think they've already escaped \[00:52:51\]](https://80000hours.org/podcast/episodes/buck-shlegeris-ai-control-scheming/#getting-ai-models-to-think-theyve-already-escaped-005251)
  - [3.12 Will they be able to tell it's a setup? \[00:58:11\]](https://80000hours.org/podcast/episodes/buck-shlegeris-ai-control-scheming/#will-they-be-able-to-tell-its-a-setup-005811)
  - [3.13 Will AI companies do any of this stuff? \[01:00:11\]](https://80000hours.org/podcast/episodes/buck-shlegeris-ai-control-scheming/#will-ai-companies-do-any-of-this-stuff-010011)
  - [3.14 Can we just give AIs fewer permissions? \[01:06:14\]](https://80000hours.org/podcast/episodes/buck-shlegeris-ai-control-scheming/#can-we-just-give-ais-fewer-permissions-010614)
  - [3.15 Can we stop human spies the same 

... (truncated, 98 KB total)

Resource ID: 8ec0801f7e9420fb | Stable ID: sid_0CVVwwXSsQ