Longterm Wiki

Google DeepMind: Frontier Safety Framework Version 3.0

This is Google DeepMind's official safety framework document (version 3.0), analogous to Anthropic's RSP and OpenAI's Preparedness Framework; essential reference for understanding industry approaches to frontier AI risk management.

Metadata

Importance: 78/100 · organizational report · primary source

Summary

Google DeepMind's Frontier Safety Framework (v3.0) defines protocols for identifying Critical Capability Levels (CCLs) at which frontier AI models may pose severe risks, and outlines mitigation approaches across three risk categories: misuse, ML R&D acceleration, and misalignment. The framework specifies risk assessment processes, response plans, and criteria for evaluating whether mitigations are sufficient before deployment.

Key Points

  • Defines 'Critical Capability Levels' (CCLs) as capability thresholds requiring specific mitigations across misuse, ML R&D, and misalignment risk categories.
  • Outlines a structured risk assessment process including evaluation of cross-cutting capabilities like agency, tool use, reasoning, and scientific understanding.
  • Acknowledges that some mitigations only provide societal value if adopted industry-wide, highlighting the need for coordinated action across AI labs.
  • Includes an exploratory approach to misalignment risk, reflecting the nascent state of research on detecting and mitigating misaligned AI systems.
  • Commits to periodic review and evolution of the framework as understanding of frontier model risks and capabilities improves.

Cited by 2 pages

| Page | Type | Quality |
|---|---|---|
| Corporate AI Safety Responses | Approach | 68.0 |
| AI Safety Cases | Approach | 91.0 |

Cached Content Preview

HTTP 200 · Fetched Mar 20, 2026 · 36 KB
# Frontier Safety Framework

Version 3.0


# Overview

The Frontier Safety Framework is a set of protocols that aims to address severe risks that may arise from the high-impact capabilities of frontier AI models. It complements Google’s suite of AI responsibility and safety practices, and enables AI innovation and deployment consistent with our AI Principles.

The Framework is informed by the broader conversation on Frontier AI Safety and Security Frameworks. The core components of such Frameworks are to:

  • Identify capability levels at which frontier AI models, without additional mitigations, could pose severe risk.
  • Implement protocols to detect the attainment of such capability levels throughout the model lifecycle.
  • Prepare and articulate proactive mitigation plans to ensure severe risks are adequately mitigated when such capability levels are attained.
  • Where required or appropriate, involve external parties to help inform and guide the approach.

In the Framework, we specify protocols for the detection of capability levels at which frontier AI models may pose severe risks (which we call "Critical Capability Levels (CCLs)"), and articulate mitigation approaches to address such risks. The Framework addresses misuse risk, risks from machine learning research and development (ML R&D), and misalignment risk. For each type of risk, we define a set of CCLs and a mitigation approach for them. Risk assessment will necessarily involve evaluating cross-cutting capabilities such as agency, tool use, reasoning, and scientific understanding.
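The gating logic implied above (a model may only be deployed once every reached CCL has sufficient mitigations) can be sketched in code. This is an illustrative sketch only: the Framework does not define numeric capability scores, and the field names, thresholds, and example values below are hypothetical.

```python
from dataclasses import dataclass

# Hypothetical model of CCL-style gating. The three categories mirror the
# Framework's risk areas, but the scores and thresholds are invented here.

@dataclass
class CapabilityAssessment:
    category: str                 # "misuse", "ml_rnd", or "misalignment"
    score: float                  # evaluated capability level (hypothetical scale)
    ccl_threshold: float          # level at which the CCL counts as reached
    mitigations_sufficient: bool  # outcome of the response-plan review

def deployment_blockers(assessments: list[CapabilityAssessment]) -> list[str]:
    """Categories where a CCL is reached but mitigations are judged insufficient."""
    return [
        a.category
        for a in assessments
        if a.score >= a.ccl_threshold and not a.mitigations_sufficient
    ]

results = [
    CapabilityAssessment("misuse", 0.7, 0.6, True),          # CCL reached, mitigated
    CapabilityAssessment("ml_rnd", 0.4, 0.8, False),         # CCL not reached
    CapabilityAssessment("misalignment", 0.9, 0.8, False),   # CCL reached, unmitigated
]
print(deployment_blockers(results))  # ['misalignment']
```

The point of the sketch is that a CCL being reached is not itself a deployment blocker; only the combination of a reached CCL and insufficient mitigations is.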

The safety and security of frontier AI models is a global public good. The protocols here represent our current understanding of, and approach to, how severe frontier AI risks may be anticipated and addressed. Importantly, there are certain mitigations whose social value is significantly reduced if not broadly applied to frontier AI models reaching critical capabilities. These mitigations are most effective when adopted by industry as a whole: our adoption of them would result in effective risk mitigation for society only if all relevant organisations provide similar levels of protection.

The Framework is based on early and evolving research. We may change our approach over time as we gain experience and insights on the projected capabilities of future frontier models. We will review the Framework periodically and we expect it to evolve substantially as our understanding of the risks and benefits of frontier models improves.


Table of Contents

Section 1: Framework

  1.1 Scope

  1.2 Critical Capability Levels

  1.3 Outline of Our Risk Assessment Process

  1.4 Response Plans and Mitigations

  1.5 Evaluating Mitigations

  1.6 Summary of Risk Acceptance Criteria

Section 2: Misuse

  2.1 Mitigation Approach

  2.2 Misuse Critical Capability Levels

Section 3: Machine Learning R&D

  3.1 Mitigation Approach

  3.2 ML R&D Critical Capability Levels

Sect

... (truncated, 36 KB total)
Resource ID: 3c56c8c2a799e4ef | Stable ID: ZDk3MjIyZT