OECD-affiliated survey on AI thresholds
Published via Oxford's AIGI and affiliated with the OECD, this survey is directly relevant to ongoing international efforts (e.g., AI Safety Institutes, EU AI Act) to establish capability thresholds that trigger regulatory scrutiny of frontier AI models.
Metadata
Importance: 62/100 (policy brief, analysis)
Summary
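This OECD-affiliated white paper reports findings from an expert survey (N = 166) and a public consultation, conducted between August 2024 and October 2024, on how thresholds for advanced AI systems should be set and used for governance and risk management. Experts rated their agreement with 98 statements about thresholds. Participants generally agreed that thresholds should be set by multiple stakeholders and that different types of threshold should each serve a specific purpose, but they were divided on the role of training compute thresholds, on how different types of thresholds should be set, and on how many thresholds there should be. The work is relevant to policymakers and AI developers designing evaluation frameworks for frontier models.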
Key Points
- Surveys expert and stakeholder perspectives on defining capability thresholds that trigger heightened oversight or regulatory action for advanced AI.
- Examines how thresholds can be operationalized across different risk categories, including biosecurity, cybersecurity, and autonomous decision-making.
- Connects threshold-setting to governance frameworks, supporting OECD-aligned international policy coordination on frontier AI.
- Addresses challenges in threshold design such as measurement reliability, evasion risk, and keeping pace with rapidly advancing capabilities.
- Produced in collaboration with the Oxford Martin AI Governance Initiative (AIGI), bridging academic research and international policy institutions.
Cited by 1 page
| Page | Type | Quality |
|---|---|---|
| AI Capability Threshold Model | Analysis | 72.0 |
Cached Content Preview
HTTP 200 | Fetched Mar 15, 2026 | 98 KB
SURVEY ON THRESHOLDS
FOR ADVANCED AI SYSTEMS
WHITE PAPER
August 2025
Authors: Jonas Schuett, Eunseo Choi, Kasumi Sugimoto, Bosco Hung,
Robert Trager, and Karine Perset
Jonas Schuett∗¹, Eunseo Choi², Kasumi Sugimoto², Bosco Hung³, Robert Trager⁴, and Karine Perset²
¹ Centre for the Governance of AI
² OECD.AI
³ University of Oxford
⁴ Oxford Martin AI Governance Initiative
Abstract
Governments around the world have recognised the need to manage risks from
advanced artificial intelligence (AI) systems. Thresholds are discussed as a potential
governance tool that could be used to determine when additional risk assessment
or mitigation measures are warranted. However, it remains unclear what specific
thresholds would be appropriate and how they should be set. This paper reports
findings from an expert survey (N = 166) and a public consultation conducted
between August 2024 and October 2024. The expert survey asked participants to
indicate their level of agreement with 98 statements about thresholds for advanced
AI systems. The public consultation provided an opportunity for the general public
to contribute perspectives that may not have been captured by the expert survey.
Participants generally agreed that thresholds should be set by multiple stakeholders
and that there should be different types of threshold, each serving a specific purpose.
Participants were divided on the question of what role training compute thresholds
should play, how exactly different types of thresholds should be set, and how many
thresholds there should be. These findings can serve as evidence in ongoing policy
discussions, yet more research is needed.
∗Corresponding author: jonas.schuett@governance.ai.
Contents
Executive Summary
1 Introduction
2 Methods
2.1 Method for the expert survey
2.1.1 Survey questionnaire
2.1.2 Sample
2.1.3 Analysis
2.2 Method for the public consultation
2.3 Method for data triangulation
3 Results
3.1 Results from the expert survey
3.2 Results from the public consultation
4 Discussion
4.1 General results
4.2 Specific results for different types of thresholds
4.2.1 Training compute thresholds
4.2.2 Capability thresholds
4.2.3 Red lines
... (truncated, 98 KB total)
Resource ID: 169aa2527260ab17 | Stable ID: ZWVjYTFhNW