Beth Barnes - Personal Homepage
barnes.page/
Beth Barnes is a key figure in AI safety evaluations, particularly around dangerous capabilities and autonomous replication; her homepage serves as a hub for her research output and professional work, relevant to those following frontier model safety assessments.
Metadata
Importance: 55/100 · homepage
Summary
Personal homepage of Beth Barnes, an AI safety researcher known for work on evaluations, dangerous capabilities assessments, and autonomous replication risks in AI systems. The page likely links to her research, projects, and professional background in the AI safety field.
Key Points
- Beth Barnes is a prominent AI safety researcher focusing on evaluations and dangerous capabilities
- Her work includes developing frameworks for assessing autonomous replication and adaptation risks in AI
- She has been involved with organizations such as ARC Evals (since renamed METR) focused on model evaluations
- Her research informs how frontier AI labs and policymakers assess dangerous capability thresholds
- Evaluations work contributes to responsible scaling policies and deployment decisions
Cited by 1 page
| Page | Type | Quality |
|---|---|---|
| METR | Organization | 66.0 |
Cached Content Preview
HTTP 200 · Fetched Mar 20, 2026 · 3 KB
# Elizabeth (Beth) Barnes

Founder & CEO at [METR](https://metr.org/), building evaluations so we know if we're getting close to very risky AI. Formerly at DeepMind and OpenAI.
Some research highlights:
- [Measuring AI ability to complete long tasks](https://metr.org/blog/2025-03-19-measuring-ai-ability-to-complete-long-tasks/) \- We propose measuring AI performance in terms of the length of tasks AI agents can complete. We show that this metric has
been consistently exponentially increasing over the past 6 years, with a doubling time of around 7 months. Extrapolating this trend
predicts that, in under a decade, we will see AI agents that can independently complete a large fraction of software tasks that currently take humans days or weeks.
- [GPT-5 autonomy evaluation report](https://metr.github.io/autonomy-evals-guide/gpt-5-report/#fnref:1) \- We evaluate whether GPT-5 poses significant catastrophic risks via AI self-improvement,
rogue replication, or sabotage of AI labs. We conclude that this seems unlikely. However,
capability trends continue rapidly, and models display increasing eval awareness.
- [Measuring the Impact of Early-2025 AI on Experienced Open-Source Developer Productivity](https://metr.org/blog/2025-07-10-early-2025-ai-experienced-os-dev-study/) \- We conduct a randomized controlled trial (RCT) to understand how early-2025 AI tools affect the productivity
of experienced open-source developers working on their own repositories. Surprisingly, we find that when
developers use AI tools, they take 19% longer than without: AI makes them slower. We view this result as a
snapshot of early-2025 AI capabilities in one relevant setting; as these systems continue to rapidly evolve,
we plan to continue using this methodology to help estimate AI acceleration from AI R&D automation.
- [Resources for Autonomy Evaluations](https://metr.github.io/autonomy-evals-guide/) \- task suite, evaluation protocol, estimates of the "elicitation gap"
- [Evaluating LLM Agents on Realistic Autonomous Tasks](https://arxiv.org/abs/2312.11671)
- [Evaluating LLMs trained on code](https://arxiv.org/abs/2107.03374) (alignment section)
- [Obfuscated arguments problem](https://www.alignmentforum.org/posts/PJLABqQ962hZEqhdB/debate-update-obfuscated-arguments-problem) \- a problem with recursive-decomposition-based alignment approaches
- ["Imitative generalisation"](https://www.alignmentforum.org/posts/JKj5Krff5oKMb8TjT/imitative-generalisation-aka-learning-the-prior-1) \- explainer for Paul Christiano's 'Learning the Prior'
- [Risks from AI persuasion](https://www.alignmentforum.org/posts/5cWtwATHL6KyzChck/risks-from-ai-persuasion) \- thoughts on the likelihood and consequences of superhuman persuasion before AGI
- [Reflection mechanisms as an alignment target](https://www.alignmentforum.org/posts/XyBWkoaqfnuEyNWXi/reflection-mechanisms-as-an-alignment-target-a-survey-1) \- work done by my AI safety c
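The exponential extrapolation in the first highlight above can be sketched in a few lines. This is an illustrative model only: the ~7-month doubling time comes from the cited blog post, but the 1-hour starting horizon and the `extrapolate_horizon` helper are assumptions for the example, not METR's actual data or code.

```python
# Illustrative sketch of a doubling-time extrapolation, assuming
# exponential growth in the "length of tasks AI agents can complete".
# The ~7-month doubling time is quoted in the post above; the 1-hour
# starting horizon is a made-up placeholder.

def extrapolate_horizon(current_hours: float, months_ahead: float,
                        doubling_months: float = 7.0) -> float:
    """Project the task-length horizon forward under exponential growth."""
    return current_hours * 2 ** (months_ahead / doubling_months)

# Under these assumptions, 5 years (60 months) out the horizon grows by
# 2**(60/7), a factor of several hundred: a 1-hour horizon becomes
# hundreds of hours, i.e. tasks taking humans days or weeks.
print(extrapolate_horizon(1.0, 60))
```

This is just compound growth applied to a capability metric; the interesting empirical claim in the post is that the trend has held consistently for ~6 years, not the arithmetic itself.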
... (truncated, 3 KB total)