Research from Owain Evans and colleagues
web · theinsideview.ai · theinsideview.ai/owain
Part of The Inside View interview series by Michaël Trazzi, featuring conversations with AI safety researchers. In this episode, Owain Evans, a research associate at the Center for Human-Compatible AI at UC Berkeley, discusses situational awareness and out-of-context reasoning in LLMs.
Metadata
Importance: 58/100 · homepage · commentary
Summary
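An interview with Owain Evans, an AI alignment researcher and research associate at the Center for Human-Compatible AI at UC Berkeley who now leads a new AI safety research group. The roughly 2h15 conversation covers two of his recent papers, "Me, Myself, and AI: The Situational Awareness Dataset (SAD) for LLMs" and "Connecting the Dots: LLMs can Infer and Verbalize Latent Structure from Disparate Training Data", along with questions submitted via Twitter.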
Key Points
- Owain Evans is an AI alignment researcher, a research associate at the Center for Human-Compatible AI at UC Berkeley, and the lead of a new AI safety research group
- The episode discusses two of his recent papers: the Situational Awareness Dataset (SAD) for LLMs and "Connecting the Dots", on LLMs inferring and verbalizing latent structure from disparate training data
- The Inside View series provides in-depth interviews with leading AI safety researchers about their work and perspectives
- Key themes are situational awareness in LLMs and out-of-context reasoning
- The conversation runs about 2h15 and also addresses questions submitted via Twitter
Cited by 1 page
| Page | Type | Quality |
|---|---|---|
| AI Safety Technical Pathway Decomposition | Analysis | 62.0 |
Cached Content Preview
HTTP 200 · Fetched Mar 20, 2026 · 93 KB
2024-08-23
# Owain Evans on Situational Awareness
[Owain Evans - AI Situational Awareness, LLM Out-of-Context Reasoning](https://www.youtube.com/watch?v=eb2oLHblrHU)
[Owain Evans - AI Situational Awareness, Out-of-Context Reasoning (Spotify)](https://open.spotify.com/episode/28tHehh7lzVKLcLINkJRN2?go=1&sp_cid=2d23829d8d90e08226d25b784d5926fe&utm_source=embed_player_p&utm_medium=desktop)
Owain Evans is an AI alignment researcher, a research associate at the [Center for Human-Compatible AI](https://humancompatible.ai/) at UC Berkeley, and now leads a new AI safety research group.
In this episode we discuss two of his recent papers, “[Me, Myself, and AI: The Situational Awareness Dataset (SAD) for LLMs](https://arxiv.org/abs/2407.04694)” and “[Connecting the Dots: LLMs can Infer and Verbalize Latent Structure from Disparate Training Data](https://arxiv.org/abs/2406.14546)”, alongside some Twitter [questions](https://x.com/MichaelTrazzi/status/1823376152783880609).
_(Our conversation is ~2h15 long, so feel free to click on any sub-topic of your liking in the Outline below. At any point you can come back by clicking on the up-arrow ⬆ at the end of sections)_
# Contents
- [Highlighted](https://theinsideview.ai/owain#highlighted-quotes)
- [Me Myself and AI: The Situational Awareness Dataset for LLMs](https://theinsideview.ai/owain#me-myself-and-ai-the-situational-awareness-dataset-for-llms)
- [Definin
... (truncated, 93 KB total)
Resource ID: f0e47fd7657fd428 | Stable ID: NjRmMjUwNT