Longterm Wiki

OpenAI: Preparedness Framework Version 2


Credibility Rating

4/5 (High)

High quality. Established institution or organization with editorial oversight and accountability.

Rating inherited from publication venue: OpenAI

OpenAI's official internal safety governance framework, updated as v2; relevant for understanding how a leading frontier AI lab operationalizes risk thresholds and pre-deployment evaluation processes.

Metadata

Importance: 72/100 · organizational report · primary source

Summary

OpenAI's Preparedness Framework v2 outlines the company's structured approach to evaluating and managing catastrophic risks from frontier AI models, including definitions of risk severity levels and thresholds that determine whether a model can be deployed or developed further. It establishes a systematic process for tracking, evaluating, and preparing for frontier model risks across domains such as CBRN threats, cyberattacks, and loss of human control. The framework represents OpenAI's operationalized safety commitments with concrete governance mechanisms.

Key Points

  • Tracks three frontier capability categories (Biological and Chemical, Cybersecurity, and AI Self-improvement), each with measurable High and Critical capability thresholds.
  • Sets deployment and development gates: models that cross a High capability threshold cannot be deployed until safeguards sufficiently minimize the associated risk of severe harm, and Critical capability calls for increased safeguards before further internal use and development.
  • Establishes a Preparedness team responsible for 'scorecard' evaluations that measure frontier models against defined capability thresholds before and after deployment.
  • Includes a safety advisory group and board-level oversight mechanism to ensure accountability beyond standard product teams.
  • Represents a living document intended to evolve as capabilities and understanding of risks develop over time.
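The gating logic in the points above can be sketched as a simple decision rule. This is a hypothetical illustration of the High/Critical threshold structure; the level names, function names, and boolean inputs are assumptions for clarity, not OpenAI's actual tooling.

```python
# Hypothetical sketch of Preparedness Framework-style capability gating.
# The rules are paraphrased from the framework's public description:
# High capability blocks deployment absent sufficient safeguards;
# Critical capability also constrains further development.

from enum import Enum


class CapabilityLevel(Enum):
    BELOW_THRESHOLD = 0  # no threshold crossed
    HIGH = 1             # safeguards required before external deployment
    CRITICAL = 2         # safeguards required even during development


def may_deploy(level: CapabilityLevel, safeguards_sufficient: bool) -> bool:
    """Deployment is allowed below the High threshold, or once
    safeguards are confirmed to sufficiently minimize the risk."""
    if level is CapabilityLevel.BELOW_THRESHOLD:
        return True
    return safeguards_sufficient


def may_continue_development(level: CapabilityLevel,
                             safeguards_sufficient: bool) -> bool:
    """Only Critical capability additionally gates further development."""
    if level is CapabilityLevel.CRITICAL:
        return safeguards_sufficient
    return True
```

The asymmetry between the two functions captures the framework's key distinction: High thresholds gate what is released, while Critical thresholds also gate what is built.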

Cited by 3 pages

1 FactBase fact citing this source

| Entity | Property | Value | As Of |
|---|---|---|---|
| OpenAI | AI Safety Level | High/Critical capability thresholds (Preparedness Framework v2) | Apr 2025 |

Cached Content Preview

HTTP 200 · Fetched Mar 20, 2026 · 62 KB
# Preparedness Framework Version 2. Last updated: 15th April, 2025

# OpenAI

OpenAI’s mission is to ensure that AGI (artificial general intelligence) benefits all of humanity. To pursue that mission, we are committed to safely developing and deploying highly capable AI systems, which create significant benefits and also bring new risks. We build for safety at every step and share our learnings so that society can make well-informed choices to manage new risks from frontier AI.

The Preparedness Framework is OpenAI’s approach to tracking and preparing for frontier capabilities that create new risks of severe harm.1 We currently focus this work on three areas of frontier capability, which we call Tracked Categories:

• Biological and Chemical capabilities that, in addition to unlocking discoveries and cures, can also reduce barriers to creating and using biological or chemical weapons.

• Cybersecurity capabilities that, in addition to helping protect vulnerable systems, can also create new risks of scaled cyberattacks and vulnerability exploitation.

• AI Self-improvement capabilities that, in addition to unlocking helpful capabilities faster, could also create new challenges for human control of AI systems.

In each area, we develop and maintain a threat model that identifies the risks of severe harm and sets thresholds we can measure to tell us when the models get capable enough to meaningfully pose these risks. We won’t deploy these very capable models until we’ve built safeguards to sufficiently minimize the associated risks of severe harm. This Framework lays out the kinds of safeguards we expect to need, and how we’ll confirm internally and show externally that the safeguards are sufficient.

In this updated version of the Framework we also introduce a set of Research Categories. These are areas of capability that could pose risks of severe harm, that do not yet meet our criteria to be Tracked Categories, and where we are investing now to further develop our threat models and capability elicitation techniques.

We are constantly refining our practices and advancing the science, to unlock the benefits of these technologies while addressing their risks. This revision of the Preparedness Framework focuses on the safeguards we expect will be needed for future models more capable than those we have today.

# Contents

1 Introduction 3

1.1 Why we’re updating the Preparedness Framework 3

2 Deciding where to focus 4

2.1 Holistic risk assessment and categorization 4

2.2 Tracked Categories 4

2.3 Research Categories 6

3 Measuring capabilities 8

3.1 Evaluation approach 8

3.2 Testing scope 9

3.3 Capability threshold determinations 9

4 Safeguarding against severe harm 10

4.1 Safeguard selection 10

4.2 Safeguard sufficiency 10

4.3 Marginal risk 12

4.4 Increasing safeguards before internal use and further development 12

5 Building trust 12

5.1 Internal governance 12

5.2 Transparency and external participation 12

A Change 

... (truncated, 62 KB total)
Resource ID: ec5d8e7d6a1b2c7c | Stable ID: YTFkNjM1Mz