[PDF] METR's AI Action Plan Comment
web
Credibility Rating: 4/5 (High)
High quality. Established institution or organization with editorial oversight and accountability.
Rating inherited from publication venue: METR
Metadata
Cited by 1 page
| Page | Type | Quality |
|---|---|---|
| Third-Party Model Auditing | Approach | 64.0 |
Cached Content Preview
HTTP 200 | Fetched Apr 30, 2026 | 33 KB
# Model Evaluation and Threat Research
**— 440 N Barranca Ave #3345 Covina, CA 91723 —**
## March 15, 2025
Faisal D’Souza, NCO 2415 Eisenhower Avenue Alexandria, VA 22314
## Response to Request for Information on the Development of an Artificial Intelligence (AI) Action Plan
METR is a research nonprofit focused on developing the science of assessing AI systems for capabilities that could pose catastrophic risks to public safety and national security. In 2024 and early 2025, METR conducted evaluations of systems like GPT-4.5 [1] and Claude 3.5 Sonnet [2] in partnership with their respective developers, typically before they were released to the public. METR has also advised many of the leading AI developers to help them establish frameworks for scaling up their risk mitigations in proportion to the measured or forecasted abilities of their systems. [3]
METR appreciates the opportunity to provide input into the shaping of the upcoming Artificial Intelligence Action Plan. We believe that this is a critical time for the U.S. government and other actors to plan soberly for the risks (and opportunities) that may come from further significant advancements in the state of AI capability. This response begins by outlining a few considerations that we believe are important for informing the general direction of the Plan. It then recommends several concrete actions that we believe merit prioritization within the Plan.
[1] [https://cdn.openai.com/gpt-4-5-system-card-2272025.pdf](https://cdn.openai.com/gpt-4-5-system-card-2272025.pdf)
[2] [https://metr.org/blog/2025-01-31-update-sonnet-o1-evals/](https://metr.org/blog/2025-01-31-update-sonnet-o1-evals/)
[3] [http://metr.org/faisc](http://metr.org/faisc)
* * *
# Summary
We believe it will be important to plan around the following three major factors:
1. AI systems are rapidly improving at carrying out tasks independently over long time horizons.
2. AI systems are improving on AI R&D tasks, in a feedback loop that could be very destabilizing.
3. AI systems will plausibly end up with unintended and/or malign goals, conceal this fact, and resist attempts to remove these goals.
In light of these factors, we recommend that the AI Action Plan include at least the following four priority actions:
1. Collaborate with the private sector to improve the information security and internal controls achievable by the leading U.S. AI developers.
2. Lead creation of narrowly-tailored formal standards for how and when AI developers should measure:
   a. Capabilities that would directly threaten public safety or national security, specifically CBRN weapons development and cyber-offense.
   b. Capabilities that would amplify risks from AI systems, specifically long-horizon autonomy, automated AI R&D, and control subversion.
3. Prepare and consider interventions on the AI development and deployment process, as future circumstances may warrant, to:
   a. Require reporting of training runs and deployments above some threshold
... (truncated, 33 KB total)
Resource ID: 14441e0a6692bf9c | Stable ID: sid_qFMWOawMHA