[2406.01637] Teams of LLM Agents can Exploit Zero-Day Vulnerabilities
Credibility Rating
Good quality. Reputable source with community review or editorial standards, but less rigorous than peer-reviewed venues.
Rating inherited from publication venue: arXiv
This paper demonstrates that multi-agent LLM systems (HPTSA) can autonomously exploit real-world zero-day vulnerabilities, raising significant AI safety concerns about dual-use capabilities and the pace at which AI agents can conduct offensive cybersecurity operations without prior knowledge of vulnerabilities.
Summary
This paper introduces HPTSA, a hierarchical multi-agent LLM framework in which a planning agent coordinates specialized subagents to exploit real-world zero-day cybersecurity vulnerabilities. Tested on a benchmark of 15 real-world vulnerabilities past GPT-4's knowledge cutoff, HPTSA achieves a 53% pass@5 success rate, outperforming prior single-agent approaches by up to 4.5x and outperforming open-source vulnerability scanners, which achieve 0% on the same benchmark.
Key Points
- HPTSA uses a hierarchical planning agent that dispatches specialized subagents, resolving context-length and long-range planning limitations of single-agent approaches.
- The system achieves 53% pass@5 on a benchmark of 15 real-world zero-day vulnerabilities, within 1.4x of a GPT-4 agent given explicit vulnerability descriptions.
- Open-source vulnerability scanners achieve 0% on the same benchmark, highlighting the significant capability gap introduced by LLM-based agents.
- The work demonstrates that multi-agent architectures substantially amplify dual-use cybersecurity risks compared to single-agent LLM systems.
- Benchmark vulnerabilities were selected to be past GPT-4's knowledge cutoff, ensuring genuine zero-day conditions rather than memorized exploits.
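The pass@5 figure in the points above can be read as the share of benchmark tasks solved in at least one of five attempts. The sketch below illustrates that reading only; the paper's exact definition is not shown in this excerpt, so this simple empirical version is an assumption, and the function name `pass_at_k` and the `outcomes` data are hypothetical.

```python
from typing import Sequence


def pass_at_k(results: Sequence[Sequence[bool]], k: int) -> float:
    """Fraction of tasks with at least one success among the first k attempts.

    Assumption: a plain empirical pass@k, not necessarily the paper's definition.
    """
    if not results:
        raise ValueError("results must contain at least one task")
    solved = sum(1 for attempts in results if any(attempts[:k]))
    return solved / len(results)


# Hypothetical outcomes: 3 tasks, 5 attempts each (not data from the paper).
outcomes = [
    [False, False, True, False, False],   # solved on the third attempt
    [False, False, False, False, False],  # never solved
    [True, True, False, False, False],    # solved on the first attempt
]

print(pass_at_k(outcomes, k=5))  # 2 of 3 tasks count as solved
print(pass_at_k(outcomes, k=1))  # only the last task succeeds on attempt 1
```

Under this reading, 8 solved tasks out of 15 would round to 53%; whether that is the exact count behind the reported figure is not stated here.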
Cited by 1 page
| Page | Type | Quality |
|---|---|---|
| AI Cyber Damage: Bounding the Tail | Analysis | -- |
Cached Content Preview
Teams of LLM Agents can Exploit Zero-Day Vulnerabilities
Richard Fang, Rohan Bindu, Akul Gupta, Qiusi Zhan, Daniel Kang
University of Illinois Urbana-Champaign
{rrfang2, bindu2, akulg3, qiusiz2, ddkang}@illinois.edu
Abstract
LLM agents have become increasingly sophisticated, especially in the realm of
cybersecurity. Researchers have shown that LLM agents can solve toy
capture-the-flag problems and exploit real-world vulnerabilities when given a
description of the vulnerability. However, these agents still perform poorly on
real-world vulnerabilities that are unknown to the agent ahead of time (zero-day
vulnerabilities).
In this work, we show that teams of LLM agents can exploit real-world,
zero-day vulnerabilities. Prior agents struggle with exploring many different
vulnerabilities and long-range planning when used alone. To resolve this, we
introduce HPTSA, a system of agents with a planning agent that can launch
subagents. The planning agent explores the system and determines which subagents
to call, resolving long-term planning issues when trying different
vulnerabilities. We construct a benchmark of 15 real-world vulnerabilities and
show that our team of agents improves over prior work by up to 4.5×.
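The planner-and-subagent structure described in the abstract can be sketched as a small dispatch loop. This is an illustrative skeleton only, not the authors' implementation: the class and function names (`Planner`, `Finding`, `make_stub`), the "try each untried subagent in order" policy, and the stub subagents that do no real work are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Finding:
    agent: str      # name of the subagent that produced this result
    summary: str    # short report handed back to the planner
    success: bool   # whether the subagent reported success


# A subagent is modelled as a function from a target description to a Finding.
Subagent = Callable[[str], Finding]


def make_stub(name: str, succeeds: bool) -> Subagent:
    """Build a placeholder subagent that only returns a canned Finding."""
    def run(target: str) -> Finding:
        return Finding(agent=name, summary=f"{name} examined {target}", success=succeeds)
    return run


@dataclass
class Planner:
    subagents: Dict[str, Subagent]
    history: List[Finding] = field(default_factory=list)

    def choose(self) -> List[str]:
        # Hypothetical policy: every subagent not yet tried, in registration order.
        tried = {f.agent for f in self.history}
        return [name for name in self.subagents if name not in tried]

    def run(self, target: str, max_dispatches: int = 5) -> List[Finding]:
        for name in self.choose()[:max_dispatches]:
            finding = self.subagents[name](target)
            self.history.append(finding)  # planner keeps only the short summary
            if finding.success:
                break  # stop dispatching once a subagent reports success
        return self.history


planner = Planner({
    "category_a": make_stub("category_a", succeeds=False),
    "category_b": make_stub("category_b", succeeds=True),
    "category_c": make_stub("category_c", succeeds=False),
})
results = planner.run("example-target")
print([f.agent for f in results])
```

The point of the split is that each subagent works in its own context and returns only a summary, so the planner's context holds the high-level history instead of every low-level step.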
1 Introduction
AI agents are rapidly becoming more capable. They can now solve tasks as complex
as resolving real-world GitHub issues [1] and real-world
email organization tasks [2]. However, as their capabilities
for benign applications improve, so does their potential in dual-use settings.
Of the dual-use applications, hacking is one of the largest concerns
[3]. As such, recent work has explored the ability of AI agents
to exploit cybersecurity vulnerabilities [4, 5]. This
work has shown that simple AI agents can autonomously hack mock
“capture-the-flag” style websites and can hack real-world vulnerabilities when
given the vulnerability description. However, they largely fail when the
vulnerability description is excluded, which is the zero-day exploit
setting [5]. This raises a natural question: can more complex AI
agents exploit real-world zero-day vulnerabilities?
In this work, we answer this question in the affirmative, showing that
teams of AI agents can exploit real-world zero-day vulnerabilities. To
show this, we develop a novel multi-agent framework for cybersecurity exploits,
extending prior work in the multi-agent setting [6, 7, 8]. We call our technique HPTSA, which (to our
knowledge) is the first multi-agent system to successfully accomplish meaningful
cybersecurity exploits.
Prior work uses a single AI agent that explores the computer system (i.e.,
website), plans the attack, and carries out the attack. Because all highly
capable AI agents in the cybersecurity setting at the time of writing are based
on large language models (LLMs), joint exploration, planning, and execution are
challenging given the limited context lengths these agents have.
We des
... (truncated, 38 KB total)
9daf5081c60538c2 | Stable ID: sid_Yw5cIewAPA