first documented AI-orchestrated cyberattack

web

Anthropic·anthropic.com/news/disrupting-AI-espionage

Credibility Rating

4/5

High(4)

High quality. Established institution or organization with editorial oversight and accountability.

Rating inherited from publication venue: Anthropic

A landmark real-world incident report from Anthropic documenting the first known AI-orchestrated espionage campaign, directly relevant to agentic AI risks, deployment safety, and the intersection of AI capabilities with national security threats.

Metadata

Importance: 85/100press releaseprimary source

Summary

Anthropic reports detecting a sophisticated September 2025 espionage campaign in which a suspected Chinese state-sponsored group weaponized Claude Code as an autonomous agent to attack roughly thirty global targets including tech companies, financial institutions, and government agencies. This is described as the first documented large-scale cyberattack executed without substantial human intervention, leveraging AI capabilities in intelligence, agency, and tool use. Anthropic responded by banning accounts, notifying victims, coordinating with authorities, and expanding detection capabilities.

Key Points

•First documented case of a large-scale AI-orchestrated cyberattack executed with minimal human intervention, attributed with high confidence to a Chinese state-sponsored group.
•Attackers exploited Claude Code's agentic capabilities—autonomous looping, task chaining, and tool access—to infiltrate ~30 global targets, succeeding in a small number.
•The attack leveraged three converging AI developments: increased model intelligence, agentic autonomy, and broad tool access (e.g., via Model Context Protocol).
•Anthropic's response included real-time account bans, victim notification, coordination with authorities, and development of improved classifiers for detecting distributed agentic attacks.
•The incident signals a new threat paradigm where AI agents dramatically lower the cost and scale of sophisticated cyber operations against critical infrastructure.

Cited by 4 pages

Page	Type	Quality
AI Cyber Damage: Bounding the Tail	Analysis	--
MAIM (Mutually Assured AI Malfunction)	Approach	55.0
Catastrophic Cyber Tail Risk	Risk	--
Cyberweapons Risk	Risk	91.0

3 FactBase facts citing this source

Entity	Property	Value	As Of
Anthropic-Disclosed AI-Orchestrated Campaign (Sept 2025)	Incident Date	Sep 2025	Nov 2025
Anthropic-Disclosed AI-Orchestrated Campaign (Sept 2025)	Financial Impact	$50M	Nov 2025
Anthropic-Disclosed AI-Orchestrated Campaign (Sept 2025)	Status	resolved	Sep 2025

Cached Content Preview

HTTP 200Fetched May 4, 202610 KB

Policy Disrupting the first reported AI-orchestrated cyber espionage campaign

 Nov 13, 2025 Read the report We recently argued that an inflection point had been reached in cybersecurity: a point at which AI models had become genuinely useful for cybersecurity operations, both for good and for ill. This was based on systematic evaluations showing cyber capabilities doubling in six months; we’d also been tracking real-world cyberattacks, observing how malicious actors were using AI capabilities. While we predicted these capabilities would continue to evolve, what has stood out to us is how quickly they have done so at scale.

 In mid-September 2025, we detected suspicious activity that later investigation determined to be a highly sophisticated espionage campaign. The attackers used AI’s “agentic” capabilities to an unprecedented degree—using AI not just as an advisor, but to execute the cyberattacks themselves.

 The threat actor—whom we assess with high confidence was a Chinese state-sponsored group—manipulated our Claude Code tool into attempting infiltration into roughly thirty global targets and succeeded in a small number of cases. The operation targeted large tech companies, financial institutions, chemical manufacturing companies, and government agencies. We believe this is the first documented case of a large-scale cyberattack executed without substantial human intervention.

 Upon detecting this activity, we immediately launched an investigation to understand its scope and nature. Over the following ten days, as we mapped the severity and full extent of the operation, we banned accounts as they were identified, notified affected entities as appropriate, and coordinated with authorities as we gathered actionable intelligence.

 This campaign has substantial implications for cybersecurity in the age of AI “agents”—systems that can be run autonomously for long periods of time and that complete complex tasks largely independent of human intervention. Agents are valuable for everyday work and productivity—but in the wrong hands, they can substantially increase the viability of large-scale cyberattacks.

 These attacks are likely to only grow in their effectiveness. To keep pace with this rapidly-advancing threat, we’ve expanded our detection capabilities and developed better classifiers to flag malicious activity. We’re continually working on new methods of investigating and detecting large-scale, distributed attacks like this one.

 In the meantime, we’re sharing this case publicly, to help those in industry, government, and the wider research community strengthen their own cyber defenses. We’ll continue to release reports like this regularly, and be transparent about the threats we find.

 Read the full report .

 How the cyberattack worked

 The attack relied on several features of AI models that did not exist, or were in much more nascent form, just a year ago:

 Intelligence. Models’ general levels of capability have increased to the poi

... (truncated, 10 KB total)

Resource ID: 4ba107b71a0707f9 | Stable ID: sid_0uw3mbdZNB