Longterm Wiki

Credibility Rating

2/5 (Mixed)

Mixed quality. Some useful content but inconsistent editorial standards. Claims should be verified.

Rating inherited from publication venue: Medium

Official update from Google DeepMind's main technical safety team, useful for understanding the institutional structure and research priorities of one of the largest industry AI safety groups as of late 2024.

Metadata

Importance: 62/100 · blog post · primary source

Summary

A 2024 overview by Google DeepMind's AGI Safety & Alignment team summarizing their recent technical work on existential risk from AI, covering subteams focused on mechanistic interpretability, scalable oversight, and frontier safety evaluations. Written by Rohin Shah, Seb Farquhar, and Anca Dragan, it describes the team's structure, growth, and key research priorities including amplified oversight and dangerous capability evaluations.

Key Points

  • The AGI Safety & Alignment team has grown ~39% and ~37% in consecutive years, encompassing subteams in mechanistic interpretability, scalable oversight, and frontier safety.
  • Big bets for 2023–2024 include amplified oversight (generating oversight signals that remain correct even when model outputs are hard for humans to evaluate directly) and other technical safety approaches.
  • The Frontier Safety subteam develops and runs dangerous capability evaluations as part of DeepMind's Frontier Safety Framework.
  • The broader org also includes Gemini Safety (safety training for current models) and Voices of All in Alignment (value/viewpoint pluralism techniques).
  • The post is intended to help external researchers build on DeepMind's work and understand how their own research connects to this agenda.

Cited by 3 pages

Cached Content Preview

HTTP 200 · Fetched Mar 15, 2026 · 40 KB

# AGI Safety and Alignment at Google DeepMind: A Summary of Recent Work

[DeepMind Safety Research](https://deepmindsafetyresearch.medium.com/?source=post_page---byline--8e600aca582a---------------------------------------)

Oct 18, 2024 · 11 min read

_By Rohin Shah, Seb Farquhar, and Anca Dragan_

It’s been a while since our last update, so we wanted to share a recap of our recent outputs. Below, we fill in some details about what we have been working on, what motivated us to do it, and how we thought about its importance. We hope that this will help people build off things we have done and see how their work fits with ours.

## Who are we?

We’re the main team at Google DeepMind working on technical approaches to existential risk from AI systems. Since our [last post](https://www.alignmentforum.org/posts/nzmCvRvPm4xJuqztv/deepmind-is-hiring-for-the-scalable-alignment-and-alignment), we’ve evolved into the AGI Safety & Alignment team, which we think of as AGI Alignment (with s

... (truncated, 40 KB total)
Resource ID: 6374381b5ec386d1 | Stable ID: NWJkZjJiZj