Longterm Wiki

[2406.01637] Teams of LLM Agents can Exploit Zero-Day Vulnerabilities

web

Credibility Rating

3/5
Good(3)

Good quality. Reputable source with community review or editorial standards, but less rigorous than peer-reviewed venues.

Rating inherited from publication venue: arXiv

This paper demonstrates that multi-agent LLM systems (HPTSA) can autonomously exploit real-world zero-day vulnerabilities, raising significant AI safety concerns about dual-use capabilities and the pace at which AI agents can conduct offensive cybersecurity operations without prior knowledge of vulnerabilities.

Metadata

Importance: 78/100 | arXiv preprint | primary source

Summary

This paper introduces HPTSA, a hierarchical multi-agent LLM framework in which a planning agent coordinates specialized subagents to exploit real-world zero-day cybersecurity vulnerabilities. Tested on a benchmark of 15 real-world vulnerabilities past GPT-4's knowledge cutoff, HPTSA achieves a 53% pass@5 success rate, outperforming prior single-agent approaches by up to 4.5x, while open-source vulnerability scanners achieve 0% on the same benchmark.

Key Points

  • HPTSA uses a hierarchical planning agent that dispatches specialized subagents, resolving context-length and long-range planning limitations of single-agent approaches.
  • The system achieves 53% pass@5 on a benchmark of 15 real-world zero-day vulnerabilities, within 1.4x of a GPT-4 agent given explicit vulnerability descriptions.
  • Open-source vulnerability scanners achieve 0% on the same benchmark, highlighting the significant capability gap introduced by LLM-based agents.
  • The work demonstrates that multi-agent architectures substantially amplify dual-use cybersecurity risks compared to single-agent LLM systems.
  • Benchmark vulnerabilities were selected to be past GPT-4's knowledge cutoff, ensuring genuine zero-day conditions rather than memorized exploits.
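
The pass@5 figure above can be read as the fraction of benchmark vulnerabilities exploited at least once in five attempts. The sketch below is a minimal illustration of that reading, not the paper's evaluation code. The function name `pass_at_k` and the outcome data are hypothetical; the data is chosen only so that 8 of 15 tasks succeed, which is the count consistent with a 53% rate on 15 tasks.

```python
from typing import Sequence


def pass_at_k(trials: Sequence[Sequence[bool]], k: int) -> float:
    """Fraction of tasks with at least one success among the first k trials."""
    if k <= 0:
        raise ValueError("k must be positive")
    if not trials:
        raise ValueError("need at least one task")
    solved = sum(1 for outcomes in trials if any(outcomes[:k]))
    return solved / len(trials)


# Hypothetical outcomes (not from the paper): 15 tasks, 5 trials each,
# with 8 tasks solved on the second trial and 7 never solved.
example = [[False, True, False, False, False]] * 8 + [[False] * 5] * 7

rate = pass_at_k(example, k=5)
print(f"pass@5 = {rate:.0%}")  # prints "pass@5 = 53%"
```

Under this definition a task counts as solved if any single attempt succeeds, so pass@5 is always at least as high as pass@1 on the same outcomes.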

Cited by 1 page

Page | Type | Quality
AI Cyber Damage: Bounding the Tail | Analysis | --

Cached Content Preview

HTTP 200 | Fetched May 4, 2026 | 38 KB
Teams of LLM Agents can Exploit Zero-Day Vulnerabilities

 
 
 
Richard Fang, Rohan Bindu, Akul Gupta, Qiusi Zhan, Daniel Kang 
 University of Illinois Urbana-Champaign
 {rrfang2, bindu2, akulg3, qiusiz2, ddkang}@illinois.edu
 
 
 

 
 Abstract

 LLM agents have become increasingly sophisticated, especially in the realm of
cybersecurity. Researchers have shown that LLM agents can exploit real-world
vulnerabilities when given a description of the vulnerability, and can solve toy
capture-the-flag problems. However, these agents still perform poorly on
real-world vulnerabilities that are unknown to the agent ahead of time (zero-day
vulnerabilities).

 In this work, we show that teams of LLM agents can exploit real-world,
zero-day vulnerabilities. Prior agents struggle with exploring many different
vulnerabilities and long-range planning when used alone. To resolve this, we
introduce HPTSA, a system of agents with a planning agent that can launch
subagents. The planning agent explores the system and determines which subagents
to call, resolving long-term planning issues when trying different
vulnerabilities. We construct a benchmark of 15 real-world vulnerabilities and
show that our team of agents improves over prior work by up to 4.5×.
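
The planner-and-subagent structure described in the abstract can be sketched structurally as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the names `Planner`, `Report`, and `make_stub` are hypothetical, the subagents are inert stubs, and the keyword-matching selection policy stands in for what would be an LLM call in a real system. The point it shows is that each subagent receives only a task string and returns a short report, so the planner's history stays small.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Report:
    agent: str
    success: bool
    summary: str


# A subagent maps a task description to a short report (hypothetical interface).
Subagent = Callable[[str], Report]


def make_stub(name: str, succeeds: bool) -> Subagent:
    """Build an inert stand-in subagent with a fixed outcome."""
    def run(task: str) -> Report:
        return Report(agent=name, success=succeeds, summary=f"{name} handled: {task}")
    return run


@dataclass
class Planner:
    subagents: Dict[str, Subagent]
    history: List[Report] = field(default_factory=list)

    def choose(self, observations: str) -> List[str]:
        # Placeholder policy (an assumption): select subagents whose name
        # appears in the observations. A real planner would query an LLM.
        return [name for name in self.subagents if name in observations]

    def run(self, observations: str) -> bool:
        # Dispatch the chosen subagents in order, keeping only their reports.
        for name in self.choose(observations):
            report = self.subagents[name](observations)
            self.history.append(report)
            if report.success:
                return True
        return False


planner = Planner({
    "alpha": make_stub("alpha", False),
    "beta": make_stub("beta", True),
    "gamma": make_stub("gamma", True),
})
solved = planner.run("site notes mention alpha and beta")
print(solved, [r.agent for r in planner.history])  # True ['alpha', 'beta']
```

Because the planner stores only compact reports rather than each subagent's full working context, the sketch mirrors the stated motivation of easing long-range planning across many attempts.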

 
 
 
 1 Introduction

 
 AI agents are rapidly becoming more capable. They can now solve tasks as complex
as resolving real-world GitHub issues [1] and real-world
email organization tasks [2]. However, as their capabilities
for benign applications improve, so does their potential in dual-use settings.

 
 
 Of the dual-use applications, hacking is one of the largest concerns
[3]. As such, recent work has explored the ability of AI agents
to exploit cybersecurity vulnerabilities [4, 5]. This
work has shown that simple AI agents can autonomously hack mock
“capture-the-flag” style websites and can hack real-world vulnerabilities when
given the vulnerability description. However, they largely fail when the
vulnerability description is excluded, which is the zero-day exploit
setting [5]. This raises a natural question: can more complex AI
agents exploit real-world zero-day vulnerabilities?

 
 
 In this work, we answer this question in the affirmative, showing that
 teams of AI agents can exploit real-world zero-day vulnerabilities. To
show this, we develop a novel multi-agent framework for cybersecurity exploits,
extending prior work in the multi-agent setting [6, 7, 8]. We call our technique HPTSA, which (to our
knowledge) is the first multi-agent system to successfully accomplish meaningful
cybersecurity exploits.

 
 
 Prior work uses a single AI agent that explores the computer system (i.e.,
website), plans the attack, and carries out the attack. Because all highly
capable AI agents in the cybersecurity setting at the time of writing are based
on large language models (LLMs), the joint exploration, planning, and execution is
challenging given the limited context lengths these agents have.

 
 
 We des

... (truncated, 38 KB total)
Resource ID: 9daf5081c60538c2 | Stable ID: sid_Yw5cIewAPA