Longterm Wiki

Credibility Rating

High (4/5)

High quality. Established institution or organization with editorial oversight and accountability.

Rating inherited from publication venue: Google DeepMind

An institutional safety framework from Google DeepMind, comparable to Anthropic's Responsible Scaling Policy and OpenAI's Preparedness Framework; useful for understanding industry approaches to capability thresholds and deployment-gated safety commitments.

Metadata

Importance: 72/100 · blog post · primary source

Summary

DeepMind's Frontier Safety Framework (FSF) establishes a structured approach to identifying and mitigating catastrophic risks from highly capable AI models before and during deployment. It introduces 'Critical Capability Levels' (CCLs) as thresholds that trigger enhanced safety evaluations, and outlines mitigation measures to prevent severe harms such as bioweapons development or AI autonomously undermining human oversight. The framework represents a concrete institutional commitment to capability-gated safety protocols.

Key Points

  • Defines 'Critical Capability Levels' (CCLs) as specific thresholds in areas like biosecurity, cybersecurity, and autonomous AI that trigger mandatory safety reviews.
  • Establishes a two-part mitigation requirement: deployment controls to prevent misuse and model-level safeguards to prevent autonomous harmful actions.
  • Focuses on two primary risk domains: uplift to catastrophic weapons (CBRN) and AI systems capable of undermining human oversight mechanisms.
  • Commits to regular evaluation of frontier models against CCLs before training completion and before deployment.
  • Positions FSF as a living document intended to evolve alongside capability advances and emerging safety research.
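The capability-gated trigger mechanism in the key points above can be sketched as a minimal check. This is an illustrative assumption of how CCL gating might look in code, not DeepMind's actual implementation; the domain names and threshold values are hypothetical:

```python
from dataclasses import dataclass

# Hypothetical sketch of capability-gated safety reviews.
# CCL domains and thresholds below are illustrative, not from the FSF.

@dataclass
class CriticalCapabilityLevel:
    domain: str       # e.g. "biosecurity", "cybersecurity", "autonomy"
    threshold: float  # evaluation score at which the CCL counts as reached

def ccls_reached(eval_scores: dict[str, float],
                 ccls: list[CriticalCapabilityLevel]) -> list[str]:
    """Return domains whose Critical Capability Level is met or exceeded,
    each of which would trigger a mandatory safety review before deployment."""
    return [c.domain for c in ccls
            if eval_scores.get(c.domain, 0.0) >= c.threshold]

ccls = [
    CriticalCapabilityLevel("biosecurity", 0.8),
    CriticalCapabilityLevel("cybersecurity", 0.7),
    CriticalCapabilityLevel("autonomy", 0.9),
]
scores = {"biosecurity": 0.3, "cybersecurity": 0.75, "autonomy": 0.4}
print(ccls_reached(scores, ccls))  # → ['cybersecurity']
```

The point of the sketch is the shape of the commitment: evaluations run against fixed thresholds, and crossing any one of them gates deployment on a review rather than leaving the decision ad hoc.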

Cited by 6 pages

Cached Content Preview

HTTP 200 · Fetched Mar 20, 2026 · 8 KB

May 17, 2024
Responsibility & Safety

# Introducing the Frontier Safety Framework

Anca Dragan, Helen King and Allan Dafoe



Our approach to analyzing and mitigating future risks posed by advanced AI models

Google DeepMind has consistently pushed the boundaries of AI, developing models that have transformed our understanding of what's possible. We believe that AI technology on the horizon will provide society with invaluable tools to help tackle critical global challenges, such as climate change, drug discovery, and economic productivity. At the same time, we recognize that as we continue to advance the frontier of AI capabilities, these breakthroughs may eventually come with new risks beyond those posed by present-day models.

Today, we are introducing our [Frontier Safety Framework](https://storage.googleapis.com/deepmind-media/DeepMind.com/Blog/introducing-the-frontier-safety-framework/fsf-technical-report.pdf) — a set of protocols for proactively identifying future AI capabilities that could cause severe harm and putting in place mechanisms to detect and mitigate them. Our Framework focuses on severe risks resulting from powerful capabilities at the model level, such as exceptional agency or sophisticated cyber capabilities. It is designed to complement our alignment research, which trains models to act in accordance with human values and societal goals, and Google’s existing suite of AI responsibility and safety [practices](https://ai.google/responsibility/principles/).

The Framework is exploratory and we expect it to evolve significantly as we learn from its implementation, deepen our understanding of AI risks and evaluations, and collaborate with industry, academia, and government. Even though these risks are beyond the reach of present-day models, we hope that implementing and improving the Framework will help us prepare to address them. We aim to have this initial framework fully implemented by early 2025.

## The framework

The first version of the Framework announced today builds on our [research](https://deepmind.google/discover/blog/an-early-warning-system-for-novel-ai-risks/) on [evaluating](https://arxiv.org/abs/2403.13793) critical capabilities in frontier models, and follows the emerging approach of [Responsible Capability Scaling.](https://www.gov.uk/government/publications/emerging-processes-for-frontier-ai-safety/emerging-processes-for-frontier-ai-safety) The Framework has three key components:

1. **Identifying capabilities a model may have with potential for severe harm.** To do this, we research the paths through which a model could cause severe harm in high-risk domains, and then determine the minimal level of capabilities a model must ha

... (truncated, 8 KB total)
Resource ID: d8c3d29798412b9f | Stable ID: NzZkNWU4MT