Longterm Wiki

AI models can be dangerous before public deployment (METR, November 13, 2024)


Credibility Rating

4/5 (High)

High quality. Established institution or organization with editorial oversight and accountability.

Rating inherited from publication venue: METR

Published by METR (Model Evaluation and Threat Research) in November 2024, this post is relevant to discussions of responsible scaling policies and the timing of safety evaluations in the AI development pipeline.

Metadata

Importance: 68/100 · blog post · analysis

Summary

METR argues that AI models pose significant safety risks during the pre-deployment phase, not just after public release. The post highlights that dangerous capabilities can be elicited from models before they reach the public, making pre-deployment evaluation and safeguards critical. This challenges the assumption that safety interventions need only focus on publicly deployed systems.

Key Points

  • Dangerous capabilities in AI models can be exploited before public deployment, not just after release to end users.
  • Pre-deployment access (e.g., by researchers, developers, or via API) creates real threat vectors that safety frameworks must address.
  • Current safety thinking may underweight risks from models that haven't yet been publicly released but are already accessible to some actors.
  • METR advocates for robust evaluation and risk assessment processes applied throughout development, not only at the point of public launch.
  • The post emphasizes the need for safety protocols and governance structures that account for the full model lifecycle.

Cached Content Preview

HTTP 200 · Fetched Mar 20, 2026 · 0 KB
# Page not found