An update on our general capability evaluations - METR

web

METR·metr.org/blog/2024-08-06-update-on-evaluations

Credibility Rating

4/5

High (4)

High quality. Established institution or organization with editorial oversight and accountability.

Rating inherited from publication venue: METR

Metadata

Cited by 1 page

Page	Type	Quality
Alignment Research Center (ARC)	Organization	57.0

Cached Content Preview

HTTP 200 · Fetched May 31, 2026 · 16 KB

An update on our general capability evaluations

CONTRIBUTORS

Beth Barnes

DATE

August 6, 2024
 
BibTeX Citation

@misc{metr-2024-update-on-evaluations,
  title = {An update on our general capability evaluations},
  author = {Beth Barnes},
  howpublished = {\url{https://metr.org/blog/2024-08-06-update-on-evaluations/}},
  year = {2024},
  month = {08},
}
 
Since releasing our autonomy evaluation resources, METR has primarily focused on developing better measures of AI capabilities. We are currently developing evaluations for AI R&D capabilities that may serve as “red lines”,¹ as well as a continuous measure of autonomous capabilities in general. In this update, we’ll talk about our recent efforts to develop evaluations for general autonomous capabilities; we plan to provide an update on our AI R&D evaluations in the near future. Note that this post is an intermediate research update for those interested in our work, and it includes preliminary results that should not be considered definitive.

 Why not focus solely on “red line” evaluations for specific threat models? Unlike the ability to develop bioweapons or execute high-value cyberattacks, possessing autonomy “in general” doesn’t directly enable an AI system to cause catastrophically bad outcomes. However, we still think that it’s worth developing more general measures of autonomous capabilities for several reasons:

 
Autonomy is a proxy for the extent to which an AI system can have far-reaching impacts on the world with minimal human involvement. We believe that such a metric might be robustly useful across a variety of threat models.

 One of our goals for evaluations is to assist with forecasting the capabilities or impacts of AI systems. For this, we expect a general capability measure that provides signal at various scales to be more useful than evaluations that contain only hard tasks more dir

... (truncated, 16 KB total)

Resource ID: 81cf326834f21a67 | Stable ID: sid_lfTI2FIzkg