Longterm Wiki
Research from Owain Evans and colleagues

web
theinsideview.ai·theinsideview.ai/owain

Part of The Inside View interview series by Michaël Trazzi, featuring conversations with AI safety researchers; in this episode, Owain Evans discusses situational awareness and out-of-context reasoning in LLMs.

Metadata

Importance: 58/100 · homepage · commentary

Summary

An interview with Owain Evans, an AI alignment researcher and research associate at the Center for Human-Compatible AI at UC Berkeley who now leads a new AI safety research group. The discussion covers two of his recent papers, "Me, Myself, and AI: The Situational Awareness Dataset (SAD) for LLMs" and "Connecting the Dots: LLMs can Infer and Verbalize Latent Structure from Disparate Training Data", along with questions submitted on Twitter.

Key Points

  • Owain Evans is an AI alignment researcher and research associate at the Center for Human-Compatible AI (UC Berkeley) who now leads a new AI safety research group
  • The episode covers two recent papers: the Situational Awareness Dataset (SAD) for LLMs and "Connecting the Dots", on LLMs inferring and verbalizing latent structure from disparate training data
  • The Inside View series provides in-depth interviews with leading AI safety researchers about their work and perspectives
  • Key themes are situational awareness in LLMs and out-of-context reasoning
  • Research explores methods for humans to maintain oversight as AI systems become more capable than human evaluators

Cited by 1 page

Cached Content Preview

HTTP 200 · Fetched Mar 20, 2026 · 93 KB

2024-08-23

# Owain Evans on Situational Awareness

[Owain Evans - AI Situational Awareness, LLM Out-of-Context Reasoning](https://www.youtube.com/watch?v=eb2oLHblrHU)

[Play on Spotify](https://open.spotify.com/episode/28tHehh7lzVKLcLINkJRN2?go=1&sp_cid=2d23829d8d90e08226d25b784d5926fe&utm_source=embed_player_p&utm_medium=desktop "Play on Spotify")

Owain Evans is an AI alignment researcher and research associate at the [Center for Human-Compatible AI](https://humancompatible.ai/) at UC Berkeley, and he now leads a new AI safety research group.

In this episode we discuss two of his recent papers, “[Me, Myself, and AI: The Situational Awareness Dataset (SAD) for LLMs](https://arxiv.org/abs/2407.04694)” and “[Connecting the Dots: LLMs can Infer and Verbalize Latent Structure from Disparate Training Data](https://arxiv.org/abs/2406.14546)”, alongside some Twitter [questions](https://x.com/MichaelTrazzi/status/1823376152783880609).

_(Our conversation is ~2h15 long, so feel free to click on any sub-topic of your liking in the Outline below. At any point you can come back by clicking on the up-arrow ⬆ at the end of sections)_

# Contents

- [Highlighted](https://theinsideview.ai/owain#highlighted-quotes)
- [Me Myself and AI: The Situational Awareness Dataset for LLMs](https://theinsideview.ai/owain#me-myself-and-ai-the-situational-awareness-dataset-for-llms)
  - [Definin

... (truncated, 93 KB total)
Resource ID: f0e47fd7657fd428 | Stable ID: NjRmMjUwNT