Buck Shlegeris on controlling AI that wants to take over - 80000 Hours
web
Credibility Rating: 3/5
Good (3): Good quality. Reputable source with community review or editorial standards, but less rigorous than peer-reviewed venues.
Rating inherited from publication venue: 80,000 Hours
Metadata
Cited by 1 page
| Page | Type | Quality |
|---|---|---|
| Sleeper Agent Detection | Approach | 66.0 |
Cached Content Preview
HTTP 200 | Fetched May 10, 2026 | 98 KB
## On this page:
- [Introduction](https://80000hours.org/podcast/episodes/buck-shlegeris-ai-control-scheming/#top)
- [1 Highlights](https://80000hours.org/podcast/episodes/buck-shlegeris-ai-control-scheming/#highlights)
- [2 Articles, books, and other media discussed in the show](https://80000hours.org/podcast/episodes/buck-shlegeris-ai-control-scheming/#articles-books-and-other-media-discussed-in-the-show)
- [3 Transcript](https://80000hours.org/podcast/episodes/buck-shlegeris-ai-control-scheming/#transcript)
- [3.1 Cold open \[00:00:00\]](https://80000hours.org/podcast/episodes/buck-shlegeris-ai-control-scheming/#cold-open-000000)
- [3.2 Who's Buck Shlegeris? \[00:01:27\]](https://80000hours.org/podcast/episodes/buck-shlegeris-ai-control-scheming/#whos-buck-shlegeris-000127)
- [3.3 What's AI control? \[00:01:51\]](https://80000hours.org/podcast/episodes/buck-shlegeris-ai-control-scheming/#whats-ai-control-000151)
- [3.4 Why is AI control hot now? \[00:05:39\]](https://80000hours.org/podcast/episodes/buck-shlegeris-ai-control-scheming/#why-is-ai-control-hot-now-000539)
- [3.5 Detecting human vs AI spies \[00:10:32\]](https://80000hours.org/podcast/episodes/buck-shlegeris-ai-control-scheming/#detecting-human-vs-ai-spies-001032)
- [3.6 Acute vs chronic AI betrayal \[00:15:21\]](https://80000hours.org/podcast/episodes/buck-shlegeris-ai-control-scheming/#acute-vs-chronic-ai-betrayal-001521)
- [3.7 How to catch AIs trying to escape \[00:17:48\]](https://80000hours.org/podcast/episodes/buck-shlegeris-ai-control-scheming/#how-to-catch-ais-trying-to-escape-001748)
- [3.8 The cheapest AI control techniques \[00:32:48\]](https://80000hours.org/podcast/episodes/buck-shlegeris-ai-control-scheming/#the-cheapest-ai-control-techniques-003248)
- [3.9 Can we get untrusted models to do trusted work? \[00:38:58\]](https://80000hours.org/podcast/episodes/buck-shlegeris-ai-control-scheming/#can-we-get-untrusted-models-to-do-trusted-work-003858)
- [3.10 If we catch a model escaping... will we do anything? \[00:50:15\]](https://80000hours.org/podcast/episodes/buck-shlegeris-ai-control-scheming/#if-we-catch-a-model-escaping-will-we-do-anything-005015)
- [3.11 Getting AI models to think they've already escaped \[00:52:51\]](https://80000hours.org/podcast/episodes/buck-shlegeris-ai-control-scheming/#getting-ai-models-to-think-theyve-already-escaped-005251)
- [3.12 Will they be able to tell it's a setup? \[00:58:11\]](https://80000hours.org/podcast/episodes/buck-shlegeris-ai-control-scheming/#will-they-be-able-to-tell-its-a-setup-005811)
- [3.13 Will AI companies do any of this stuff? \[01:00:11\]](https://80000hours.org/podcast/episodes/buck-shlegeris-ai-control-scheming/#will-ai-companies-do-any-of-this-stuff-010011)
- [3.14 Can we just give AIs fewer permissions? \[01:06:14\]](https://80000hours.org/podcast/episodes/buck-shlegeris-ai-control-scheming/#can-we-just-give-ais-fewer-permissions-010614)
- [3.15 Can we stop human spies the same
... (truncated, 98 KB total)
Resource ID: 8ec0801f7e9420fb | Stable ID: sid_0CVVwwXSsQ