Longterm Wiki

AI Lab Watch: Commitments Tracker

Useful for researchers and policymakers tracking the gap between AI lab safety rhetoric and demonstrated practice; complements formal regulatory frameworks by documenting voluntary commitments.

Metadata

Importance: 62/100 · tool page · reference

Summary

AI Lab Watch's Commitments Tracker monitors and evaluates the public safety commitments made by major AI laboratories, tracking whether frontier AI companies are honoring pledges related to safety, governance, and responsible deployment. It serves as an accountability tool by systematically documenting what labs have promised and assessing follow-through.

Key Points

  • Tracks public safety commitments made by frontier AI labs including OpenAI, Anthropic, and Google DeepMind
  • Provides accountability by comparing stated commitments against actual lab behavior and policies
  • Covers areas including safety testing, whistleblower protections, government reporting, and international coordination
  • Serves as a watchdog resource for civil society and policymakers evaluating AI lab trustworthiness
  • Aggregates commitments from voluntary pledges, government agreements, and industry frameworks in one reference

Cited by 7 pages

Cached Content Preview

HTTP 200 · Fetched Mar 20, 2026 · 21 KB
### [AI Lab Watch](https://ailabwatch.org/)

[**Categories**](https://ailabwatch.org/categories) [**Companies**](https://ailabwatch.org/companies) [**Resources**](https://ailabwatch.org/resources) [**Blog**](https://ailabwatch.substack.com/) [**About**](https://ailabwatch.org/about)

This page collects AI companies' commitments relevant to AI safety and extreme risks.

# Commitments by several companies

## White House voluntary commitments

The [White House voluntary commitments](https://bidenwhitehouse.archives.gov/wp-content/uploads/2023/07/Ensuring-Safe-Secure-and-Trustworthy-AI.pdf) were joined by [Amazon, Anthropic, Google, Inflection, Meta, Microsoft, and OpenAI in July 2023](https://bidenwhitehouse.archives.gov/briefing-room/statements-releases/2023/07/21/fact-sheet-biden-harris-administration-secures-voluntary-commitments-from-leading-artificial-intelligence-companies-to-manage-the-risks-posed-by-ai/); [Adobe, Cohere, IBM, Nvidia, Palantir, Salesforce, Scale AI, and Stability AI in September 2023](https://bidenwhitehouse.archives.gov/briefing-room/statements-releases/2023/09/12/fact-sheet-biden-harris-administration-secures-voluntary-commitments-from-eight-additional-artificial-intelligence-companies-to-manage-the-risks-posed-by-ai/); and [Apple in July 2024](https://bidenwhitehouse.archives.gov/briefing-room/statements-releases/2024/07/26/fact-sheet-biden-harris-administration-announces-new-ai-actions-and-receives-additional-major-voluntary-commitment-on-ai/).

The commitments "apply only to generative models that are overall more powerful than the current most advanced model produced by the company making the commitment," but all relevant companies have created more powerful models since making the commitments. The commitments most relevant to safety are:

- "\[I\]nternal and external red-teaming of models or systems in areas including misuse, societal risks, and national security concerns, such as bio, cyber, \[autonomous replication,\] and other safety areas."
- "\[A\]dvanc\[e\] ongoing research in AI safety, including on the interpretability of AI systems' decision-making processes and on increasing the robustness of AI systems against misuse."
- "Work toward information sharing among companies and governments regarding trust and safety risks, dangerous or emergent capabilities, and attempts to circumvent safeguards": "establish or join a forum or mechanism through which they can develop, advance, and adopt shared standards and best practices for frontier AI safety."
- "Invest in cybersecurity and insider threat safeguards to protect proprietary and unreleased model weights": "limit\[\] access to model weights to those whose job function requires it and establish\[\] a robust insider threat detection program consistent with protections provided for their most valuable intellectual property and trade secrets. In addition . . . stor\[e\] and work\[\] with the weights in an appropriately secure environment to reduce the risk of unsanct

... (truncated, 21 KB total)
Resource ID: 91ca6b1425554e9a | Stable ID: NWE5ZmZmZm