Interviewing AI researchers on automation of AI R&D

web

Epoch AI · epoch.ai/blog/interviewing-ai-researchers-on-automation-o...

Credibility Rating

4/5

High (4)

High quality. Established institution or organization with editorial oversight and accountability.

Rating inherited from publication venue: Epoch AI

Relevant to AI safety forecasting and concerns about recursive self-improvement; provides empirical grounding on how close AI is to automating its own research and development pipeline.

Metadata

Importance: 62/100 | interview, analysis

Summary

Epoch AI conducted qualitative interviews with eight AI researchers to characterize AI R&D workflows, understand disagreements about automation timelines, and evaluate benchmarks for measuring AI capabilities in research tasks. Key finding: engineering tasks (coding, debugging) are the primary near-term driver of R&D automation, and most researchers believe existing engineering-focused evaluations, if solved by AI, would substantially accelerate their work.

Key Points

• Engineering tasks like coding and debugging are more time-consuming than hypothesis generation, making them the central bottleneck for R&D automation in the near term.
• Researcher predictions for full automation timelines vary enormously (years to centuries), but converge on engineering tasks as the key near-term driver.
• 6 of 8 researchers said that if AI could autonomously solve existing R&D engineering evaluations, a substantial fraction of researcher work hours would be automated.
• Researchers recommended more challenging, open-ended evaluations and fine-grained assessment of AI agent reliability to better track progress.
• The study suggests the key question is when engineering tasks get automated, not which tasks matter most for AI-driven acceleration.

Cited by 1 page

Page	Type	Quality
Self-Improvement and Recursive Enhancement	Capability	69.0

Cached Content Preview

HTTP 200 | Fetched Apr 9, 2026 | 18 KB

Interviewing AI researchers on automation of AI R&D | Epoch AI

 Introduction

 The question of when and how AI might automate AI R&D is crucial for AI forecasting: if AI could automate the tasks involved in AI research, it could drastically accelerate AI progress. There is a long history of researchers considering this question in the abstract and describing its importance for how AI will shape the future. [1] However, AI researchers disagree substantially on timelines for automating AI R&D. For instance, researchers' predictions for when all AI researcher tasks will be automated range from years to centuries. [2]

 In this work, we interviewed AI researchers with three goals:

 
 • Characterize AI R&D work tasks in detail, to better understand how automation might take place.
 • Clarify the reasoning underpinning researchers' predictions about automation, to see where and why they disagree.
 • Collect their views on evaluations intended to measure how capable AI systems are at performing AI R&D, to better understand how society can track AI progress in this critical area.

 To do this, we used qualitative interviews. We asked open-ended questions to eight AI researchers across industry, nonprofit, and academic labs who have either published at leading conferences or have comparable experience. We identified recurring themes in their answers and summarized them, providing example quotations from participants where relevant.

 Participants’ predictions for automation differed greatly, similar to pre-existing survey findings. However, all participants agreed engineering tasks will be the main driver of R&D automation in the next five years. If participants are correct, the question is when engineering tasks will be automated, rather than which tasks are relevant. If AI could solve existing AI R&D evaluations, focused on engineering tasks, most participants believed this would significantly accelerate their work.

 
 Summary of findings

 Creating hypotheses and planning research are vital for AI R&D, but researchers’ descriptions suggest they occupy relatively little time within a project. Meanwhile, engineering tasks such as coding and debugging are similarly important, but more time-consuming. Engineering tasks are central to participants’ work, even as they become more senior and take on more planning and management responsibilities.

 
 Predictions differ greatly for automation pace, but share a focus on engineering tasks as the driver of R&D automation in the near term. Differences in predictions for automation arise mostly from differences in timelines for when engineering tasks will be automated, rather than differing beliefs about which tasks are relevant.

 
 Existing R&D AI evaluations, focused on engineering, are a promising starting point. Participants gave feedback on example R&D evaluations about implementing ML experiments and debugging. 6/8 participants predicted that if AI could autonomously solve these, a

... (truncated, 18 KB total)

Resource ID: 5eacdec296a81a08 | Stable ID: sid_T3ndSA2L1w