Longterm Wiki

[2404.08144] LLM Agents can Autonomously Exploit One-day Vulnerabilities

web

Credibility Rating

3/5
Good (3)

Good quality. Reputable source with community review or editorial standards, but less rigorous than peer-reviewed venues.

Rating inherited from publication venue: arXiv

This paper demonstrates that GPT-4 agents can autonomously exploit real-world one-day cybersecurity vulnerabilities with an 87% success rate when given the CVE description, raising significant concerns about the dual-use risks of capable LLMs and the need for deployment safeguards.

Metadata

Importance: 78/100 · arXiv preprint · primary source

Summary

The paper shows that GPT-4-based LLM agents can autonomously exploit 87% of a benchmark of 15 real-world one-day CVE vulnerabilities when given CVE descriptions, vastly outperforming all other tested models and scanners. Without CVE descriptions, performance drops to 7%, indicating the agent is better at exploitation than discovery. These findings raise serious questions about the risks of deploying highly capable LLM agents.

Key Points

  • GPT-4 agents successfully exploited 87% of 15 real-world one-day vulnerabilities when provided CVE descriptions; every other model tested (GPT-3.5, open-source LLMs) and the open-source scanners ZAP and Metasploit achieved 0%.
  • Without CVE descriptions, GPT-4's success rate drops to 7%, showing the agent excels at exploitation rather than vulnerability discovery.
  • The exploit agent required only 91 lines of code using the ReAct framework, demonstrating the low barrier to creating such tools.
  • Tested vulnerabilities included critical-severity CVEs spanning websites, container management software, and Python packages.
  • Findings highlight dual-use risks of frontier LLMs and the need for careful governance around deployment of highly capable AI agents.
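
As a quick sanity check on the percentages above, the following is a minimal sketch that recovers the whole-vulnerability counts behind them. It assumes the reported rates are counts out of 15 rounded to the nearest percent; the helper name `counts_matching` is illustrative and does not come from the paper.

```python
def counts_matching(pct: int, total: int) -> list[int]:
    """Return every whole count k in 0..total whose rate rounds to pct percent."""
    return [k for k in range(total + 1) if round(100 * k / total) == pct]


TOTAL = 15  # benchmark size stated in the paper's abstract

# Assumption: each reported percentage is k / 15 rounded to the nearest integer.
with_description = counts_matching(87, TOTAL)     # GPT-4 given the CVE description
without_description = counts_matching(7, TOTAL)   # GPT-4 without the description
other_models = counts_matching(0, TOTAL)          # every other model and scanner

print(with_description, without_description, other_models)
# prints: [13] [1] [0]
```

Under that rounding assumption, each percentage maps to exactly one count: 13 of 15 with the description, 1 of 15 without it, and 0 of 15 for the other models and scanners.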

Cited by 1 page

Page | Type | Quality
AI Cyber Damage: Bounding the Tail | Analysis | --

Cached Content Preview

HTTP 200 · Fetched May 4, 2026 · 43 KB
LLM Agents can Autonomously Exploit One-day Vulnerabilities

 
 
 Richard Fang, Rohan Bindu, Akul Gupta, Daniel Kang
 
 

 
 Abstract

 LLMs have become increasingly powerful, both in their benign and malicious
uses. With the increase in capabilities, researchers have been increasingly
interested in their ability to exploit cybersecurity vulnerabilities. In
particular, recent work has conducted preliminary studies on the ability of LLM
agents to autonomously hack websites. However, these studies are limited to
simple vulnerabilities.

 In this work, we show that LLM agents can autonomously exploit one-day
vulnerabilities in real-world systems. To show this, we collected a
dataset of 15 one-day vulnerabilities that include ones categorized as critical
severity in the CVE description. When given the CVE description, GPT-4 is
capable of exploiting 87% of these vulnerabilities compared to 0% for every
other model we test (GPT-3.5, open-source LLMs) and open-source vulnerability
scanners (ZAP and Metasploit). Fortunately, our GPT-4 agent requires the CVE
description for high performance: without the description, GPT-4 can exploit
only 7% of the vulnerabilities. Our findings raise questions around the
widespread deployment of highly capable LLM agents.

 
 
 
 1 Introduction

 
 Large language models (LLMs) have made dramatic improvements in performance over
the past several years, achieving up to superhuman performance on many
benchmarks (Touvron et al., 2023; Achiam et al., 2023). This performance has led to
a deluge of interest in LLM agents, which can take actions via tools,
self-reflect, and even read documents (Lewis et al., 2020). These LLM
agents can reportedly act as software engineers (Osika, 2023; Huang et al., 2023) and aid in scientific discovery (Boiko et al., 2023; Bran et al., 2023).

 
 
 However, not much is known about the abilities of LLM agents in the realm of
cybersecurity. Recent work has primarily focused on the “human uplift” setting
(Happe & Cito, 2023; Hilario et al., 2024), where an LLM is used as a
chatbot to assist a human, or on speculation in the broader category of offense vs.
defense (Lohn & Jackson, 2022; Handa et al., 2019). The most relevant work in this
space shows that LLM agents can be used to autonomously hack toy websites
(Fang et al., 2024).

 
 
 However, to the best of our knowledge, all of the work in this space focuses on
toy problems or “capture-the-flag” exercises, which do not reflect
real-world deployments (Fang et al., 2024; Happe & Cito, 2023; Hilario et al., 2024). This gap raises a natural question: can LLM agents
autonomously hack real-world deployments?

 
 
 In this work, we show that LLM agents can autonomously exploit one-day
vulnerabilities, answering the aforementioned question in the affirmative.

 
 
 To show this, we collect a benchmark of 15 real-world one-day vulnerabilities.
These vulnerabilities were taken from the Common Vulnerabilities and Exposures
(CVE) dat

... (truncated, 43 KB total)
Resource ID: 6c39d224a4a74a56 | Stable ID: sid_EwPyPtcI2Q