Carnegie Endowment: If-Then Commitments for AI Risk Reduction
Credibility Rating
High quality. Established institution or organization with editorial oversight and accountability.
Rating inherited from publication venue: Carnegie Endowment
Published by Carnegie Endowment in September 2024, this policy analysis is relevant for understanding conditional governance mechanisms that could operationalize AI safety commitments made by frontier labs and governments, complementing frameworks like the Bletchley Declaration and voluntary safety pledges.
Metadata
Summary
This Carnegie Endowment report proposes 'if-then' conditional commitment frameworks for AI governance, where AI developers and governments pre-commit to specific risk-mitigation actions triggered by defined capability thresholds or adverse events. The approach aims to reduce AI risks by creating credible, enforceable pledges that activate before harms materialize. It bridges voluntary industry commitments and binding regulation.
Key Points
- Proposes conditional 'if-then' commitments where specific AI risk triggers automatically require predefined mitigation responses from developers or governments.
- Addresses the gap between voluntary AI safety pledges (often vague) and hard regulation (slow to enact), offering a middle-ground mechanism.
- Commitments could cover scenarios like capability thresholds, dangerous evaluation results, or incident reports triggering deployment pauses or third-party audits.
- Framework emphasizes credibility and verifiability, requiring commitments to be specific, measurable, and subject to oversight.
- Relevant to ongoing debates around frontier AI governance, safety frameworks like those from major labs, and international coordination efforts.
Cited by 1 page
| Page | Type | Quality |
|---|---|---|
| AI Policy Effectiveness | Analysis | 64.0 |
Cached Content Preview

## If-Then Commitments for AI Risk Reduction
If-then commitments are an emerging framework for preparing for risks from AI without unnecessarily slowing the development of new technology. The more attention and interest there is in these commitments, the faster a mature framework can progress.
By [Holden Karnofsky](https://carnegieendowment.org/people/holden-karnofsky)
Published on Sep 13, 2024
### Introduction
Artificial intelligence (AI) could pose a variety of catastrophic risks to international security in several domains, including the proliferation and acceleration of cyberoffense capabilities, and of the ability to develop chemical or biological weapons of mass destruction. Even the most powerful AI models today are not yet capable enough to pose such risks,1 but the coming years could see fast and hard-to-predict changes in AI capabilities. Both companies and governments have shown significant interest in finding ways to prepare for such risks without unnecessarily slowing the development of new technology.
This piece is a primer on an emerging framework for handling this challenge: if-then commitments. These are commitments of the form: _If an AI model has capability X, risk mitigations Y must be in place. And, if needed, we will delay AI deployment and/or development to ensure the mitigations can be present in time._ A specific example: _If an AI model has the ability to walk a novice through constructing a weapon of mass destruction, we must ensure that there are no easy ways for consumers to elicit behavior in this category from the AI model._
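The conditional structure lends itself to a mechanical reading: evaluations produce capability measurements, and each commitment maps a trigger condition to a set of required mitigations. Below is a minimal Python sketch of that trigger-to-mitigation logic, assuming a simple score-based evaluation; all names, metrics, and thresholds (`wmd_uplift`, `jailbreak_hardening`, the 0.5 cutoff) are hypothetical illustrations, not anything specified in the paper.

```python
from dataclasses import dataclass

# Hypothetical shape of evaluation output: metric name -> measured score.
EvalResults = dict[str, float]

@dataclass
class IfThenCommitment:
    """One illustrative 'if capability X, then mitigations Y' rule."""
    capability: str                  # a dangerous-capability eval metric
    threshold: float                 # score at which the commitment triggers
    required_mitigations: list[str]  # mitigations that must then be in place

def missing_mitigations(results: EvalResults,
                        commitments: list[IfThenCommitment],
                        in_place: set[str]) -> list[str]:
    """Return triggered mitigations that are not yet in place.

    Under an if-then commitment, a non-empty result means deployment
    (and possibly development) pauses until the list is empty.
    """
    missing: list[str] = []
    for c in commitments:
        if results.get(c.capability, 0.0) >= c.threshold:
            missing += [m for m in c.required_mitigations if m not in in_place]
    return missing

# Made-up usage: an eval crosses the trigger and no mitigation exists yet.
commitments = [IfThenCommitment("wmd_uplift", 0.5, ["jailbreak_hardening"])]
print(missing_mitigations({"wmd_uplift": 0.7}, commitments, in_place=set()))
# -> ['jailbreak_hardening']: pause until it is in place, then proceed.
```

The sketch only captures the shape of the mechanism: the commitment is specific and checkable before any threshold is crossed, which is what distinguishes it from a vague safety pledge.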
If-then commitments can be voluntarily adopted by AI developers; they also, potentially, can be enforced by regulators. Adoption of if-then commitments could help reduce risks from AI in two key ways: (a) prototyping, battle-testing, and building consensus around a potential framework for regulation; and (b) helping AI developers and others build roadmaps of what risk mitigations need to be in place by when. Such adoption does not require agreement on whether major AI risks are imminent—a polarized topic—only that certain situations _would_ require certain risk mitigations _if_ they came to pass.
Three industry leaders—[Google DeepMind](https://deepmind.google/discover/blog/introducing-the-frontier-safety-framework/), [OpenAI](https://openai.com/preparedness/), and [Anthropic](https://www.anthropic.com/index/anthropics-responsible-scaling-policy)—have published relatively detailed frameworks along these lines. Sixteen companies have [announced](https://www.gov.uk/government/publications/frontier-ai-safety-commitments-ai-seoul-summit-2024/frontier-ai-safety-commitments-ai-seoul-summit-2024) their intention to establish frameworks in a similar spirit by the time of the upcoming 2025 AI Action Summit
... (truncated, 84 KB total)