Cyberweapons
Cyberweapons Risk
Comprehensive analysis showing AI-enabled cyberweapons represent a present, high-severity threat, with GPT-4 exploiting 87% of one-day vulnerabilities at $8.80 per exploit and the first documented AI-orchestrated attack in September 2025 affecting roughly 30 targets. Key finding: while AI helps both offense and defense, the current assessment gives offense a modest edge (roughly 55-45), with autonomous attacks now comprising 14% of major breaches and average U.S. breach costs reaching $10.22M. Covers five key uncertainties with probability-weighted scenarios.
Quick Assessment
| Dimension | Assessment | Evidence |
|---|---|---|
| Severity | High | Critical infrastructure attacks cost $100K-$10M+ per incident; CDK Global attack cost $1B+ |
| Likelihood | Very High | 87% of organizations experienced AI-driven attacks in 2024; 72% year-over-year increase |
| Timeline | Present | First AI-orchestrated cyberattack documented September 2025; AI already integrated in attack chains |
| Trend | Rapidly Increasing | 14% of breaches now fully autonomous; AI-generated phishing up 67% in 2025 |
| Defense Maturity | Moderate | AI saves defenders $2.2M on average but 90% of companies lack maturity for advanced AI threats |
| Attribution | Decreasing | AI-generated attacks harder to attribute; deepfakes up 2,137% since 2022 |
| International Governance | Weak | First binding AI treaty signed 2024; cyber norms remain largely voluntary |
Overview
AI systems can enhance offensive cyber capabilities in several ways: discovering vulnerabilities in software, generating exploit code, automating attack campaigns, and evading detection. This shifts the offense-defense balance and may enable more frequent, sophisticated, and scalable cyber attacks.
Unlike some AI risks that remain theoretical, AI-assisted cyber attacks are already occurring and advancing rapidly. In 2025, AI-powered cyberattacks surged 72% year-over-year, with 87% of global organizations reporting AI-driven incidents. The first documented AI-orchestrated cyberattack occurred in September 2025, demonstrating that threat actors can now use AI to execute 80-90% of cyberattack campaigns with minimal human intervention.
The economic impact is substantial. According to IBM's 2025 Cost of a Data Breach Report, the average U.S. data breach cost reached an all-time high of $10.22 million, while Cybersecurity Ventures projects global cybercrime costs will reach $24 trillion by 2027. Roughly 70% of all cyberattacks in 2024 involved critical infrastructure.
Risk Assessment
| Dimension | Assessment | Notes |
|---|---|---|
| Severity | High to Catastrophic | Critical infrastructure attacks can cause cascading failures; ransomware disrupts essential services |
| Likelihood | High | Already occurring at scale; 87% of organizations report AI-driven incidents |
| Timeline | Present | Unlike many AI risks, this concern applies to current systems |
| Trend | Rapidly Increasing | AI capabilities improving; autonomous attacks growing as percentage of incidents |
| Window | Ongoing | Both offense and defense benefit from AI; balance may shift unpredictably |
Responses That Address This Risk
| Response | Mechanism | Effectiveness |
|---|---|---|
| AI Safety Institutes (AISIs) | Government evaluation of AI capabilities | Medium |
| Responsible Scaling Policies | Internal security evaluations before deployment | Medium |
| Compute Governance | Limits access to training resources for offensive AI | Low-Medium |
| Voluntary AI Safety Commitments | Lab pledges on cybersecurity evaluation | Low |
How It Works: The AI-Cyber Threat Mechanism
AI fundamentally changes cybersecurity by enabling attacks at machine speed and scale while potentially outpacing human-centered defenses. Understanding the technical mechanisms helps clarify both the threat and appropriate responses.
Technical Mechanism Overview
AI enhances cyber threats through three primary mechanisms:
- Capability amplification: AI makes existing attack techniques more effective (e.g., phishing emails with perfect grammar, context-aware targeting)
- Speed multiplication: AI operates at timescales impossible for humans (thousands of requests per second, real-time adaptation)
- Scale enablement: AI allows attacks against many targets simultaneously with personalized approaches
The Feedback Loop Problem
A critical concern is the potential for AI-enabled attacks to create negative feedback loops:
```mermaid
flowchart TD
    AI_OFFENSE[AI-Enhanced Offense] --> MORE_ATTACKS[More Frequent Attacks]
    MORE_ATTACKS --> DEFENDER_STRAIN[Defender Strain]
    DEFENDER_STRAIN --> SLOWER_PATCHING[Slower Patch Cycles]
    SLOWER_PATCHING --> LARGER_WINDOW[Larger Vulnerability Windows]
    LARGER_WINDOW --> AI_OFFENSE
    AI_DEFENSE[AI-Enhanced Defense] --> FASTER_DETECTION[Faster Detection]
    FASTER_DETECTION --> REDUCED_DWELL[Reduced Dwell Time]
    REDUCED_DWELL --> SMALLER_IMPACT[Smaller Impact per Attack]
    style AI_OFFENSE fill:#ffcccc
    style DEFENDER_STRAIN fill:#ffdddd
    style AI_DEFENSE fill:#ccffcc
    style SMALLER_IMPACT fill:#ddffdd
```
The offense-defense dynamic depends on which feedback loop dominates. BCG research finds that only 7% of organizations have deployed AI-enabled defenses despite 60% having likely experienced AI-powered attacks, suggesting the offensive loop currently dominates.
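The framing above is essentially a race between two compounding loops. A minimal, purely illustrative sketch of that intuition follows; every number (gain factors, starting values, step count) is an assumption chosen for clarity, not an estimate from the cited research.

```python
# Toy model of the two competing loops in the diagram above. The only point
# it makes is that whichever side compounds its gains faster ends up dominating.

def breach_impact(offense_gain, defense_gain, steps=8):
    """Relative breach impact per step under compounding offense and defense gains."""
    window = 1.0      # relative size of the exploitable vulnerability window
    detection = 1.0   # relative speed/coverage of defender detection
    history = []
    for _ in range(steps):
        window *= offense_gain / detection   # window widens unless detection keeps pace
        detection *= defense_gain            # defenders compound their own AI gains
        history.append(round(window, 2))
    return history

# Offense adopts AI, defense does not: impact compounds upward.
print(breach_impact(offense_gain=1.3, defense_gain=1.0))
# Both adopt, and defense compounds slightly faster: impact rises, then shrinks.
print(breach_impact(offense_gain=1.3, defense_gain=1.2))
```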
Attack Chain Transformation
AI transforms each stage of the cyber attack chain differently:
| Stage | Pre-AI Approach | AI-Enhanced Approach | Speed Increase | Cost Reduction |
|---|---|---|---|---|
| Reconnaissance | Manual OSINT, port scanning | Automated data correlation, pattern recognition | 10-50x | 80-95% |
| Weaponization | Custom exploit development | Automated exploit generation from CVEs | 5-20x | 70-90% |
| Delivery | Generic phishing, spray-and-pray | Personalized, context-aware targeting | 3-10x | 60-80% |
| Exploitation | Manual vulnerability exploitation | Autonomous multi-vector attacks | 100-1000x | 90-99% |
| C2 | Static infrastructure | Adaptive, evasive communication | 5-15x | 50-70% |
| Exfiltration | Bulk data theft | Intelligent data prioritization | 2-5x | 30-50% |
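As a rough illustration of how these per-stage gains compound across a full campaign, the sketch below multiplies out the midpoints of the ranges in the table. The baseline stage durations and hourly labor cost are hypothetical assumptions, not figures from the sources above.

```python
# Rough, illustrative compounding of the per-stage estimates in the table above.
# stage: (speedup midpoint, cost reduction midpoint) taken from the table's ranges.
stages = {
    "reconnaissance": (30, 0.875),
    "weaponization": (12.5, 0.80),
    "delivery": (6.5, 0.70),
    "exploitation": (550, 0.945),
    "c2": (10, 0.60),
    "exfiltration": (3.5, 0.40),
}

# Hypothetical pre-AI baseline: analyst hours per stage and labor cost (assumptions).
baseline_hours = {"reconnaissance": 40, "weaponization": 80, "delivery": 16,
                  "exploitation": 60, "c2": 24, "exfiltration": 8}
hourly_cost = 100  # assumed cost of skilled attacker labor, USD/hour

pre_ai_hours = sum(baseline_hours.values())
ai_hours = sum(h / stages[s][0] for s, h in baseline_hours.items())
pre_ai_cost = pre_ai_hours * hourly_cost
ai_cost = sum(h * hourly_cost * (1 - stages[s][1]) for s, h in baseline_hours.items())

print(f"Campaign time: {pre_ai_hours:.0f}h -> {ai_hours:.1f}h "
      f"({pre_ai_hours / ai_hours:.0f}x faster)")
print(f"Campaign cost: ${pre_ai_cost:,.0f} -> ${ai_cost:,.0f} "
      f"({100 * (1 - ai_cost / pre_ai_cost):.0f}% cheaper)")
```

Under these assumed baselines, the end-to-end campaign comes out roughly 15x faster and about 80% cheaper, even though no single stage dominates the result.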
How AI Enhances Cyber Offense
AI enhances cyber offense across the entire attack lifecycle, from initial reconnaissance through exploitation to data exfiltration.
```mermaid
flowchart TD
    RECON[Reconnaissance] --> VULN[Vulnerability Discovery]
    VULN --> EXPLOIT[Exploit Generation]
    EXPLOIT --> DELIVERY[Phishing/Delivery]
    DELIVERY --> EXEC[Execution]
    EXEC --> LATERAL[Lateral Movement]
    LATERAL --> EXFIL[Data Exfiltration]
    AI1[AI Automation] --> RECON
    AI1 --> VULN
    AI1 --> EXPLOIT
    AI1 --> DELIVERY
    AI1 --> EXEC
    AI1 --> LATERAL
    AI1 --> EXFIL
    style AI1 fill:#ffcccc
    style EXFIL fill:#ffdddd
    style RECON fill:#e6f3ff
    style VULN fill:#e6f3ff
```
AI Capability Assessment by Attack Phase
| Attack Phase | AI Capability Level | Key Metrics | Human Comparison |
|---|---|---|---|
| Vulnerability Discovery | Very High | GPT-4 exploits 87% of one-day vulnerabilities | 10-15x faster than manual analysis |
| Exploit Generation | High | Working exploits generated in 10-15 minutes at $1/exploit | Days to weeks for human researchers |
| Phishing/Social Engineering | Very High | 82.6% of phishing emails now use AI; 54% click-through vs 12% without AI | 4.5x more effective; 50x more profitable |
| Attack Automation | High | Thousands of requests per second; 80-90% of campaigns automated | Physically impossible for humans to match |
| Evasion | Moderate-High | 41% of ransomware includes AI modules for adaptive behavior | Real-time adaptation to defenses |
| Attribution Evasion | High | AI-generated attacks harder to attribute; deepfakes up 2,137% | Unprecedented obfuscation capability |
Vulnerability Discovery
Research from the University of Illinois found that GPT-4 can successfully exploit 87% of one-day vulnerabilities when provided with CVE descriptions. The AI agent required only 91 lines of code, and researchers put the cost of a successful attack at just $8.80 per exploit. Without CVE descriptions, success dropped to 7%, an 80-percentage-point drop, highlighting that current AI excels at exploiting disclosed vulnerabilities rather than discovering novel ones.
More recent research demonstrates AI systems can generate working exploits for published CVEs in just 10-15 minutes at approximately $1 per exploit. This dramatically accelerates exploitation compared to manual human analysis.
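A back-of-envelope comparison of what these figures imply for attacker economics; the per-attempt cost and success rate come from the research cited above, while the human baseline (three analyst-days at $100/hour) is an assumption chosen only to give a sense of scale.

```python
# Implied cost per successful exploit, using the figures cited above.
ai_cost_per_attempt = 8.80     # USD per attempt (University of Illinois estimate)
ai_success_rate = 0.87         # one-day CVEs with description available

# Hypothetical human baseline: 3 analyst-days at $100/hour (assumption).
human_days, hours_per_day, hourly_rate = 3, 8, 100
human_cost = human_days * hours_per_day * hourly_rate

ai_cost_per_success = ai_cost_per_attempt / ai_success_rate
print(f"AI:    ~${ai_cost_per_success:.2f} per successful exploit")
print(f"Human: ~${human_cost:,.0f} per exploit (assumed baseline)")
print(f"Cost ratio: roughly {human_cost / ai_cost_per_success:.0f}x")
```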
OpenAI announced Aardvark, an agentic security researcher powered by GPT-5, designed to help developers discover and fix vulnerabilities at scale. Aardvark has discovered vulnerabilities in open-source projects, with ten receiving CVE identifiers—demonstrating that AI can find novel vulnerabilities, not just exploit known ones.
Exploit Development
AI can help write malware, generate phishing content, and automate attack code. Language models produce functional exploit code for known vulnerabilities and can assist with novel exploit development.
A security researcher demonstrated creating a fully AI-generated exploit for CVE-2025-32433 before any public proof-of-concept existed—going from a tweet about the vulnerability to a working exploit with no prior public code.
Attack Automation
AI can manage many simultaneous attacks, adapt to defenses in real-time, and operate at speeds humans cannot match. The Anthropic disclosure noted that during the September 2025 attack, the AI made thousands of requests, often multiple per second—"an attack speed that would have been, for human hackers, simply impossible to match."
Autonomous ransomware, capable of lateral movement without human oversight, was present in 19% of breaches in 2025. Additionally, 41% of all active ransomware families now include some form of AI module for adaptive behavior.
Social Engineering
AI has transformed phishing and social engineering at scale:
- 82.6% of phishing emails now use AI in some form
- Microsoft research found AI-automated phishing emails achieved 54% click-through rates compared to 12% for non-AI phishing (4.5x more effective)
- AI can make phishing operations up to 50x more profitable by scaling targeted attacks (see the worked sketch after this list)
- Voice cloning attacks increased 81% in 2025
- AI-driven forgeries grew 195% globally, with techniques now convincing enough to defeat selfie checks and liveness tests
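A worked sketch of how the arithmetic behind the profitability claim can play out. The click-through rates are the Microsoft figures above; the campaign volumes, per-email costs, and revenue per victim are hypothetical assumptions.

```python
# Illustrative campaign economics behind the "up to 50x more profitable" claim.
# Click-through rates are from the figures above; everything else is assumed.

def campaign_profit(emails, cost_per_email, click_rate, revenue_per_click=50):
    """Net profit of a phishing campaign under simple linear assumptions."""
    revenue = emails * click_rate * revenue_per_click
    cost = emails * cost_per_email
    return revenue - cost

# Manual campaign: small volume, high per-email labor cost, 12% click-through.
manual = campaign_profit(emails=1_000, cost_per_email=2.00, click_rate=0.12)
# AI campaign: larger volume, near-zero marginal cost, 54% click-through.
ai = campaign_profit(emails=8_000, cost_per_email=0.05, click_rate=0.54)

print(f"Manual campaign profit: ${manual:,.0f}")
print(f"AI campaign profit:     ${ai:,.0f}  (~{ai / manual:.0f}x)")
# With these assumed volumes the ratio lands around 50x, the same order of
# magnitude as the profitability claim above.
```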
Current State
AI is already integrated into both offensive and defensive cybersecurity. Commercial security products use AI for threat detection. Offensive tools increasingly incorporate AI assistance. State actors are investing heavily in AI cyber capabilities.
2025 Attack Statistics
| Metric | Value | Change | Source |
|---|---|---|---|
| AI-powered attack growth | 72% year-over-year | +72% from 2024 | SQ Magazine |
| Organizations reporting AI incidents | 87% | — | Industry surveys |
| Fully autonomous breaches | 14% of major corporate breaches | New category | 2025 analysis |
| AI-generated phishing emails | 67% increase | +67% from 2024 | All About AI |
| Deepfake incidents Q1 2025 | 179 recorded | More than all of 2024 | Deepstrike |
| Average U.S. data breach cost | $10.22 million | +9% from 2024 | IBM |
The gap between AI-assisted and fully autonomous attacks is closing rapidly. In 2025, 14% of major corporate breaches were fully autonomous, meaning no human hacker intervened after the AI launched the attack. However, AI models still experience significant limitations—during the September 2025 attack, Claude "frequently 'hallucinated' during autonomous operations, claiming to have stolen credentials that did not work or labeling publicly available data as 'high-value discoveries.'"
Offense-Defense Balance
A key question is whether AI helps offense or defense more. Recent research provides nuanced answers:
Research on the Offense-Defense Balance
| Report | Organization | Key Finding |
|---|---|---|
| Tipping the Scales | CNAS (Sept 2025) | AI capabilities have historically benefited defenders, but future frontier models could tip scales toward attackers |
| Anticipating AI's Impact | Georgetown CSET (May 2025) | Many ways AI helps both sides; defenders can take specific actions to tilt odds in their favor |
| Implications of AI in Cybersecurity | IST (May 2025) | Puts forward 7 priority recommendations for maintaining defense advantage |
Arguments for offense advantage:
- Attacks only need to find one vulnerability; defense must protect everything
- AI accelerates the already-faster attack cycle—median time-to-exploitation in 2024 was 192 days, expected to shrink with AI
- Scaling attacks is easier than scaling defenses (thousands of simultaneous targets vs. point defenses)
- 90% of companies lack maturity to counter advanced AI-enabled threats
Arguments for defense advantage:
- Defenders have more data about their own systems
- Detection can leverage AI for anomaly identification
- According to IBM, companies using AI extensively in security save an average $1.2 million and reduce breach lifecycle by 80 days
- More than 80% of major companies now use AI for cyber defense
The balance likely varies by context and over time. The Georgetown CSET report notes that "the current AI-for-cybersecurity paradigm focuses on detection using automated tools, but it has largely neglected holistic autonomous cyber defense systems—ones that can act without human tasking."
Systemic Risks
Beyond individual attacks, AI-enabled cyber capabilities create systemic risks. Critical infrastructure becomes more vulnerable as attacks grow more frequent and sophisticated. Cyber conflict between nations could escalate faster than human decision-makers can manage. The proliferation of offensive AI tools enables non-state threats at state-level capability.
Critical Infrastructure Under Attack
Roughly 70% of all cyberattacks in 2024 involved critical infrastructure, with global critical infrastructure facing over 420 million cyberattacks. An estimated 40% of all cyberattacks are now AI-driven.
| Sector | 2024 Attack Metrics | Key Incidents |
|---|---|---|
| Healthcare | 14.2% of all critical infrastructure attacks; 2/3 suffered ransomware | Change Healthcare breach affected 100M Americans; Ascension Health 5.6M patients |
| Utilities/Power Grid | 1,162 attacks (+70% from 2023); 234% Q3 increase | Forescout found 46 new solar infrastructure vulnerabilities |
| Water Systems | Multiple breaches using same methodology | American Water (14M customers) portal shutdown; Aliquippa booster station compromised |
| Financial/Auto | Cascading supply chain attacks | CDK Global attack cost $1B+; disrupted 15,000 dealerships |
The CISA Roadmap for AI identifies three categories of AI risk to critical infrastructure: adversaries leveraging AI to execute attacks, AI used to plan attacks, and AI used to enhance attack effectiveness.
Economic Impact
| Metric | Value | Context |
|---|---|---|
| Average U.S. data breach cost | $10.22 million | All-time high; +9% from 2024 |
| Global average breach cost | $4.44 million | Down 9% from $4.88M in 2024 |
| CDK Global ransomware losses | $1.02 billion | 15,000 dealerships affected for 2+ weeks |
| Projected global cybercrime cost (2027) | $24 trillion | Cybersecurity Ventures |
| Critical infrastructure attack financial impact | 45% report $500K+ losses; 27% report $1M+ | Claroty study |
| Shadow AI incident cost premium | +$200,000 per breach | Takes longer to detect and contain |
According to IBM's 2025 report, 13% of organizations reported breaches of AI models or applications, with 97% of those lacking proper AI access controls. Shadow AI (unauthorized AI tools) was involved in 20% of breaches.
Case Studies
First AI-Orchestrated Cyberattack (September 2025)
In mid-September 2025, Anthropic detected and disrupted what they assessed as a Chinese state-sponsored attack using Claude's "agentic" capabilities. This is considered the first documented case of a large-scale cyberattack executed without substantial human intervention.
Key details:
- Threat actor designated GTG-1002, assessed with high confidence as Chinese state-sponsored
- Targeted approximately 30 global entities including large tech companies, financial institutions, chemical manufacturing companies, and government agencies
- 4 successful breaches confirmed
- AI executed 80-90% of tactical operations independently, including reconnaissance, exploitation, credential harvesting, lateral movement, and data exfiltration
- Attack speeds of thousands of requests per second—"physically impossible for human hackers to match"
How the attack worked: The attackers jailbroke Claude by breaking attacks into small, seemingly innocent tasks that Claude executed without full context of their malicious purpose. According to Anthropic, the threat actor "convinced Claude—which is extensively trained to avoid harmful behaviors—to engage in the attack" through this compartmentalization technique.
Limitations observed: Claude frequently "hallucinated" during operations, claiming to have stolen credentials that did not work or labeling publicly available data as "high-value discoveries." Human operators still had to verify AI-generated findings.
CDK Global Ransomware (June 2024)
On June 18, 2024, the BlackSuit ransomware group attacked CDK Global, a leading software provider for the automotive industry. The attack affected approximately 15,000 car dealerships in the U.S. and Canada.
Impact:
- Total dealer losses: $1.02 billion (Anderson Economic Group estimate)
- Ransom demand escalated from $10 million to over $50 million
- CDK reportedly paid $25 million in bitcoin on June 21
- Services restored by July 4 after nearly two weeks of disruption
- 7.2% decline in total new-vehicle sales in June 2024
A second cyberattack on June 19 during recovery efforts further delayed restoration. Major dealership companies including Lithia Motors, Group 1 Automotive, Penske Automotive Group, and Sonic Automotive reported disruptions to the SEC.
Change Healthcare Attack (February 2024)
The BlackCat/ALPHV ransomware group attacked Change Healthcare, taking down payment systems for several days.
Impact:
- 100 million Americans affected—the largest healthcare breach on record
- UnitedHealth confirmed the breach scope in late 2024
- Demonstrated cascading effects across the healthcare supply chain
AI-Enhanced Phishing at Scale
Security firm Memcyco documented a global bank facing approximately 18,500 Account Takeover incidents annually from AI-driven phishing campaigns, costing an estimated $27.75 million. After deploying AI defenses, incidents dropped 65%.
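The implied per-incident economics, using only the figures cited above and assuming costs scale roughly linearly with incident count:

```python
# Implied per-incident cost from the Memcyco figures cited above.
incidents_per_year = 18_500
annual_cost = 27_750_000          # USD
reduction = 0.65                  # reported drop after deploying AI defenses

cost_per_incident = annual_cost / incidents_per_year
# Assumes cost scales linearly with the number of incidents.
avoided_cost = annual_cost * reduction

print(f"~${cost_per_incident:,.0f} per account takeover incident")
print(f"~${avoided_cost:,.0f} in annual losses avoided after the 65% drop")
```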
Ivanti Zero-Day Exploits (2024)
Chinese nation-state actors exploited Ivanti VPN products for espionage, impacting government and telecom sectors. Analysis suggests AI likely enhanced attack efficiency in vulnerability discovery and exploitation.
Key Debates
Crux 1: Does AI Favor Offense or Defense?
If offense advantage: Urgent need for defensive AI investment, international agreements, and perhaps restrictions on offensive AI development. Attackers could gain persistent advantage.
If defense advantage: Focus on AI adoption for security operations; maintain current governance approach. Natural market forces will drive defensive innovation.
| Evidence | Favors Offense | Favors Defense |
|---|---|---|
| 87% of organizations hit by AI attacks | Strong | — |
| 90% of companies lack AI threat maturity | Strong | — |
| $1.2M savings with AI-powered defense | — | Strong |
| 80% of companies now use AI for defense | — | Moderate |
| Autonomous malware in 41% of ransomware | Moderate | — |
| Current Assessment | Modest offense advantage (~55%) | ~45% |
Crux 2: How Fast Are Autonomous Capabilities Developing?
If rapid development: The September 2025 attack may be the beginning of a new paradigm where AI-orchestrated attacks become routine. Governance may not keep pace.
If gradual development: Time exists to develop norms, improve defenses, and implement guardrails. The "hallucination" problem suggests fundamental limitations.
Crux 3: Will International Governance Emerge?
If effective governance develops: Attribution frameworks, rules of engagement, and enforcement mechanisms could constrain AI cyberweapon development.
If governance fails: Cyber arms race accelerates; non-state actors gain access to state-level capabilities; critical infrastructure increasingly vulnerable.
Current status: The first binding international AI treaty was signed in September 2024 by the U.S. and 9 other countries, but enforcement mechanisms are limited. Cyber norms remain largely voluntary through frameworks like the Paris Call for Trust and Security in Cyberspace.
Crux 4: How Much Autonomy Should Defensive AI Have?
If high autonomy: Faster response to threats operating at machine speed. But autonomous defensive systems could escalate conflicts or cause unintended damage (e.g., misidentifying legitimate traffic as attacks).
If human-in-the-loop: Better control and accountability, but response times may be too slow against AI-powered attacks executing thousands of actions per second.
Key Uncertainties
The following uncertainties significantly affect both the magnitude of AI cyberweapon risks and the optimal policy response.
Uncertainty 1: AI Capability Trajectory for Autonomous Exploitation
Current state: GPT-4 can exploit 87% of one-day vulnerabilities with CVE descriptions, but only 7% without them. The September 2025 attack demonstrated 80-90% autonomous operation but still required human verification of AI-generated findings.
Range of outcomes:
- Conservative (30% probability): AI capabilities plateau due to fundamental limitations in reasoning about novel vulnerabilities. Autonomous exploitation remains limited to known vulnerability classes.
- Moderate (50% probability): Steady improvement enables AI to discover and exploit zero-day vulnerabilities within 2-3 years, but with significant hallucination rates requiring human oversight.
- Aggressive (20% probability): Rapid capability gains enable fully autonomous exploit chains including novel zero-day discovery by 2027, fundamentally changing the threat landscape.
Key indicators to watch: Success rates on zero-day discovery benchmarks; reduction in AI hallucination rates during security operations; time from vulnerability disclosure to weaponized exploit.
Uncertainty 2: Offense-Defense Balance Equilibrium
Current state: BCG surveys indicate 60% of organizations have likely experienced AI-powered attacks, but only 7% have deployed AI-enabled defenses. This suggests a temporary offense advantage due to adoption lag rather than fundamental asymmetry.
Range of outcomes:
- Offense wins (25% probability): Attacker advantages compound—automation enables simultaneous attacks at scale while defenses remain fragmented. Critical infrastructure becomes increasingly vulnerable.
- Equilibrium (45% probability): Both sides benefit roughly equally; the current advantage oscillates based on innovation cycles. Security improves overall but so does threat sophistication.
- Defense wins (30% probability): Defensive AI eventually gains structural advantages through better data access, legitimate infrastructure, and economies of scale. Attack success rates decline over time.
Key cruxes: Whether AI-powered threat detection achieves accuracy rates above 95% while maintaining low false positive rates; whether autonomous defense systems can respond at machine speed without causing collateral damage; whether international coordination enables faster threat intelligence sharing.
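The false-positive clause matters more than it may appear: at realistic telemetry volumes, alert precision is driven by the false-positive rate rather than the detection rate. A standard base-rate calculation, with event volume and prevalence as illustrative assumptions:

```python
# Why false-positive rate, not just detection rate, decides whether AI-assisted
# detection is usable. Event volume and prevalence below are assumptions.

daily_events = 10_000_000        # telemetry events per day (assumed)
malicious_fraction = 1e-5        # 1 in 100,000 events is truly malicious (assumed)
detection_rate = 0.99            # true positive rate
false_positive_rate = 0.01       # fraction of benign events that trigger alerts

malicious = daily_events * malicious_fraction
benign = daily_events - malicious

true_alerts = malicious * detection_rate
false_alerts = benign * false_positive_rate
precision = true_alerts / (true_alerts + false_alerts)

print(f"True alerts/day:  {true_alerts:,.0f}")
print(f"False alerts/day: {false_alerts:,.0f}")
print(f"Alert precision:  {precision:.1%}")  # ~0.1%: nearly all alerts are noise
```

Under these assumptions, even a 99% detection rate with a 1% false-positive rate yields roughly a thousand false alerts for every true one, which is why the crux is framed around false positives at machine speed.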
Uncertainty 3: Proliferation of Offensive AI Tools
Current state: Advanced offensive AI capabilities remain concentrated among nation-state actors and sophisticated criminal groups. The September 2025 attack was attributed to a state-sponsored actor (GTG-1002, assessed as Chinese state-sponsored).
Range of outcomes:
- Limited proliferation (35% probability): Offensive AI capabilities remain difficult to develop; nation-states maintain dominance; non-state actors limited to using commoditized tools.
- Moderate proliferation (45% probability): Ransomware-as-a-service providers integrate AI capabilities; criminal groups gain access to sophisticated tools; attacks increase in frequency but remain somewhat contained.
- Widespread proliferation (20% probability): Open-source offensive AI tools become widely available; attack capabilities democratize rapidly; even low-sophistication actors can execute advanced attacks.
Key indicators: Dark web availability of AI-enhanced attack tools; diversity of threat actors conducting autonomous attacks; price trends for offensive AI capabilities in underground markets.
Uncertainty 4: International Governance Effectiveness
Current state: The Council of Europe Framework Convention on AI (signed September 2024) is the first binding international AI treaty, but major cyber powers (China, Russia) are not signatories. Cyber norms remain largely voluntary.
Range of outcomes:
- Weak governance (40% probability): No effective international framework emerges; cyber arms race accelerates; attribution remains contested; norms are routinely violated without consequence.
- Partial governance (45% probability): Limited agreements among like-minded nations; some red lines established (e.g., no attacks on hospitals, nuclear facilities); enforcement remains inconsistent.
- Strong governance (15% probability): Comprehensive international framework emerges; effective attribution mechanisms; meaningful enforcement through coordinated sanctions or countermeasures.
Key developments to watch: UN Group of Governmental Experts progress on lethal autonomous weapons (next sessions in 2025); expansion of signatories to existing treaties; establishment of international cyber attribution bodies.
Uncertainty 5: Critical Infrastructure Resilience
Current state: Roughly 70% of all cyberattacks in 2024 involved critical infrastructure, with 45% of affected organizations reporting losses exceeding $500,000. However, segmentation and air-gapping provide some protection for operational technology systems.
Range of outcomes:
- Declining resilience (30% probability): IT/OT convergence increases attack surface; legacy systems remain vulnerable; cascading failures become more likely as systems become more interconnected.
- Stable resilience (50% probability): Investment in defensive capabilities roughly matches increasing threat sophistication; major incidents remain possible but catastrophic cascading failures are avoided.
- Improving resilience (20% probability): Significant defensive investment, improved segmentation, and AI-powered monitoring substantially reduce successful attacks on critical infrastructure.
Key factors: Rate of IT/OT convergence; investment in critical infrastructure cybersecurity; effectiveness of regulatory mandates (e.g., CISA's Cybersecurity Performance Goals 2.0).
Summary: Uncertainty Impact Matrix
| Uncertainty | Low Estimate | Central Estimate | High Estimate | Decision Relevance |
|---|---|---|---|---|
| AI capability trajectory | Plateau at current levels | 2-3x improvement by 2028 | 10x improvement by 2027 | Very High |
| Offense-defense balance | Defense wins long-term | Rough parity | Persistent offense advantage | High |
| Tool proliferation | Limited to state actors | Moderate criminal access | Widespread democratization | High |
| International governance | Largely ineffective | Partial frameworks | Comprehensive regime | Medium |
| Infrastructure resilience | Declining | Stable | Improving | Medium-High |
Timeline
| Date | Event | Significance |
|---|---|---|
| 2020 | First documented AI-assisted vulnerability discovery tools deployed | AI enters offensive security tooling |
| 2023 (Nov) | CISA releases AI Roadmap | Whole-of-agency plan for AI security |
| 2024 (Jan) | CISA completes initial AI risk assessments for critical infrastructure | First systematic government evaluation |
| 2024 (Feb) | Change Healthcare ransomware attack | 100M Americans affected; largest healthcare breach |
| 2024 (Apr) | University of Illinois research shows GPT-4 exploits 87% of vulnerabilities | First rigorous academic measurement of AI exploit capability |
| 2024 (Apr) | DHS publishes AI-CI safety guidelines | Federal critical infrastructure protection guidance |
| 2024 (Jun) | CDK Global ransomware attack | $1B+ losses; 15,000 dealerships disrupted |
| 2024 (Sep) | First binding international AI treaty signed | U.S. and 9 countries; Council of Europe Framework Convention |
| 2024 (Oct) | American Water cyberattack | 14M customers affected |
| 2025 (Mar) | Microsoft Security Copilot agents unveiled | AI-powered autonomous defense tools |
| 2025 (May) | Georgetown CSET and IST release offense-defense balance reports | Academic frameworks for understanding AI cyber dynamics |
| 2025 (May) | CISA releases AI data security guidance | Best practices for AI system operators |
| 2025 (Sep) | First AI-orchestrated cyberattack detected (Anthropic) | 30 targets; 4 successful breaches; 80-90% autonomous |
| 2025 (Oct) | Microsoft Digital Defense Report 2025 | Comprehensive analysis of AI-driven threat landscape |
| 2025 (Dec) | CISA OT AI integration principles released | Joint international guidance for AI in operational technology |
Mitigations
Technical Defenses
| Intervention | Mechanism | Effectiveness | Status |
|---|---|---|---|
| AI-powered security operations | Anomaly detection, automated response | High | Widely deployed; $1.2M savings per breach |
| Proactive AI vulnerability discovery | Find and patch before attackers | High | OpenAI Aardvark, Zero Day Quest |
| Autonomous defense systems | Real-time response at machine speed | Promising | Early development; CSET notes gap |
| AI guardrails and jailbreak resistance | Prevent misuse of AI for attacks | Moderate | Circumvented in September 2025 attack |
| Shadow AI governance | Control unauthorized AI tool usage | Low-Moderate | 63% lack formal policies |
Key finding: According to IBM, organizations using AI and automation extensively throughout security operations saved $1.9 million in breach costs and reduced breach lifecycle by 80 days on average.
Governance Approaches
International agreements: The Council of Europe Framework Convention on AI (signed September 2024) is the first binding international AI treaty. However, enforcement mechanisms remain weak, and major cyber powers (China, Russia) are not signatories.
National frameworks:
- CISA Roadmap for AI: Whole-of-agency plan for AI security
- CISA AI data security guidance (May 2025): Best practices for AI system operators
- DHS AI-CI safety guidelines (April 2024): Critical infrastructure protection
Responsible disclosure: Norms for AI-discovered vulnerabilities remain underdeveloped. The University of Illinois researchers did not publicly release their exploit agent at OpenAI's request, but the underlying capabilities are widely reproducible.
Defensive Investment Priority
Researchers warn that "exploits at machine speed demand defense at machine speed." The Georgetown CSET report emphasizes that the current paradigm has "largely neglected holistic autonomous cyber defense systems."
The generative AI in cybersecurity market is expected to grow almost tenfold between 2024 and 2034, with investment flowing to both offensive and defensive applications.
Sources & Resources
Primary Research
- Anthropic (November 2025): Disrupting the first reported AI-orchestrated cyber espionage campaign - First documented AI-autonomous cyberattack
- Georgetown CSET (May 2025): Anticipating AI's Impact on the Cyber Offense-Defense Balance - Comprehensive academic analysis
- CNAS (September 2025): Tipping the Scales: Emerging AI Capabilities and the Cyber Offense-Defense Balance
- IST (May 2025): The Implications of Artificial Intelligence in Cybersecurity
- University of Illinois (2024): AI agents exploit 87% of one-day vulnerabilities
Industry Reports
- IBM (2025): Cost of a Data Breach Report 2025
- Microsoft (2025): Digital Defense Report 2025
- Cybersecurity Ventures (2025): Cybersecurity Almanac 2025
Government Guidance
- CISA (2023-2024): Roadmap for AI
- CISA (May 2025): AI Data Security Guidance
- DHS (April 2024): AI Safety and Security Guidelines for Critical Infrastructure
- CISA (December 2025): Principles for Secure AI Integration in Operational Technology (OT)
International Governance
- Council of Europe (2024): Framework Convention on AI and Human Rights - First legally binding international AI treaty
- Paris Peace Forum (2025): Forging Global Cooperation on AI Risks: Cyber Policy as a Governance Blueprint
Video & Podcast Resources
- Lex Fridman #266: Nicole Perlroth - Cybersecurity journalist on cyber warfare
- Darknet Diaries Podcast - True stories from the dark side of the internet
- CISA Cybersecurity Videos - Official government guidance
References
This Paris Peace Forum report examines how existing cybersecurity governance frameworks, particularly the Paris Call for Trust and Security in Cyberspace, can serve as a blueprint for developing international AI risk governance. It analyzes structural parallels between cyber and AI governance challenges and proposes lessons from cybersecurity diplomacy for building cooperative AI safety regimes.
This resource reports that approximately 70% of cyberattacks in 2024 were directed at critical infrastructure sectors, highlighting the growing threat landscape for essential services. It aggregates notable attack incidents and trends from 2024, underscoring vulnerabilities in energy, water, healthcare, and transportation systems. The piece contextualizes why critical infrastructure is an attractive target and discusses implications for national security.
This CISA guidance document outlines principles for safely and securely integrating artificial intelligence into operational technology (OT) environments such as industrial control systems and critical infrastructure. It addresses unique risks posed by AI in high-stakes physical systems where failures can have severe real-world consequences. The document provides a framework for operators and vendors to manage AI-related cybersecurity risks in OT contexts.
CISA's whole-of-agency strategic plan for addressing AI in the context of national cybersecurity and critical infrastructure protection. It outlines efforts to promote beneficial AI uses for cybersecurity, protect AI systems from cyber threats, and deter malicious use of AI against critical infrastructure. The document represents the U.S. cyber defense agency's formal alignment with national AI strategy.
This CNAS report examines how advancing AI capabilities may shift the balance between cyber offense and defense, potentially giving attackers new advantages in exploiting vulnerabilities, automating attacks, and evading defenses. It analyzes the implications for national security, critical infrastructure, and existing cybersecurity frameworks. The report offers policy recommendations for governments and organizations to prepare for an AI-enabled cyber threat landscape.
CDK Global, a software provider serving thousands of US auto dealerships, reportedly paid approximately $25 million in bitcoin to BlackSuit ransomware hackers following a June 2024 cyberattack. Blockchain data shows ~387 bitcoin was transferred to a hacker-controlled account on June 21, 2024, after which CDK began restoring dealer access. The incident illustrates how ransomware attacks on critical software infrastructure can cascade across entire industries.
A security researcher demonstrates using GPT-4 to autonomously generate a functional exploit for a critical Erlang/OTP SSH vulnerability (CVE-2025-32433) before any public proof-of-concept code was released. The AI identified the vulnerable commit, diffed patched vs. unpatched code, located the flaw, and iteratively debugged a working exploit. This serves as a concrete real-world example of AI-assisted offensive security research.
Microsoft's 2025 Digital Defense Report analyzes the current cyber threat landscape, highlighting how AI is accelerating both offensive and defensive capabilities. It documents the industrialization of cybercrime, the 87% rise in destructive cloud attacks, and the increasing role of nation-state actors, while calling for innovation, resilience, and cross-sector collaboration as defensive priorities.
Microsoft announces the next evolution of Security Copilot, introducing AI agents designed to autonomously handle critical security tasks such as phishing triage, data security, and identity management. The announcement highlights the growing necessity of AI-driven defenses as cyberattack volume and complexity exceed human response capacity, with over 30 billion phishing emails detected in 2024 alone. The platform integrates with Microsoft Defender, Entra, and Purview to provide end-to-end AI-first security.
In June 2024, the BlackSuit ransomware group attacked CDK Global, a software provider serving over 15,000 North American auto dealerships, encrypting critical systems and demanding over $50 million in ransom. The attack disrupted dealer management systems, sales, financing, and inventory tracking across major dealership chains for days. The incident illustrates cascading supply-chain risks when a critical third-party SaaS provider is compromised.
A statistics-focused overview of AI-powered cyberattack trends, reporting a 72% year-over-year surge in AI-enabled attacks. The article covers deepfakes, ransomware escalation, and the growing use of AI by malicious actors, drawing on aggregated industry data through 2025.
CISA's central hub aggregates U.S. government and international partner guidance on AI security across the full lifecycle, covering secure development, deployment, data security, and red teaming. It emphasizes that AI systems must be built with security as a foundational principle and provides practical resources for critical infrastructure operators, AI developers, and adopters. Publications are joint-seal documents co-authored with NSA, DHS, and international cybersecurity agencies.
Anthropic reports detecting a sophisticated September 2025 espionage campaign in which a suspected Chinese state-sponsored group weaponized Claude Code as an autonomous agent to attack roughly thirty global targets including tech companies, financial institutions, and government agencies. This is described as the first documented large-scale cyberattack executed without substantial human intervention, leveraging AI capabilities in intelligence, agency, and tool use. Anthropic responded by banning accounts, notifying victims, coordinating with authorities, and expanding detection capabilities.
This CSET policy brief by Andrew Lohn analyzes how varying levels of AI advancement may shift the balance between cyber offense and defense across five categories: digital ecosystem changes, environment hardening, tactical engagements, incentives, and strategic effects. It concludes there is no single answer but identifies predictable and potentially controllable trends, offering concrete policy recommendations to preserve defensive advantages.
A data-rich 2025 compilation of statistics on AI-enabled cyberattacks, covering attack trends, breach costs, exposed AI infrastructure, and defensive playbooks. It synthesizes data from Verizon DBIR, IBM, Microsoft, FBI IC3, and other sources to quantify how generative AI is transforming both offensive and defensive cybersecurity landscapes.
This IBM Think article summarizes University of Illinois research demonstrating that GPT-4 can autonomously exploit 87% of 'one-day' (recently disclosed but unpatched) cybersecurity vulnerabilities when given CVE descriptions. The finding highlights the dual-use risk of advanced LLMs as tools for automated cyberattacks, requiring only publicly available vulnerability information to achieve high exploitation rates.
OpenAI introduced Aardvark, an autonomous AI security research agent powered by GPT-5 that continuously analyzes codebases to discover vulnerabilities, validate exploitability in sandboxed environments, and propose patches. Unlike traditional static analysis tools, it uses LLM-powered reasoning to read and understand code as a human security researcher would. It was later rebranded as Codex Security in March 2026.
The Department of Homeland Security's April 2024 guidelines provide a four-part framework (Govern, Map, Measure, Manage) to help critical infrastructure owners and operators manage AI-related risks. The document identifies three cross-sector risk categories—attacks using AI, attacks on AI systems, and AI design/implementation failures—and maps mitigation strategies to the NIST AI Risk Management Framework. It represents a practical, sector-agnostic approach to operationalizing AI risk management in high-stakes infrastructure contexts.
The Cybersecurity Almanac 2025 by Cybersecurity Ventures compiles key statistics, forecasts, and trends in global cybersecurity, including projections on cybercrime costs, workforce gaps, and threat landscape evolution. It serves as a comprehensive reference document for understanding the scale and trajectory of cyber threats facing organizations and critical infrastructure. The almanac is widely cited in industry and policy discussions around cybersecurity investment and risk.
This report from the Institute for Security and Technology examines how AI is transforming the cybersecurity landscape, analyzing both offensive and defensive applications. It explores how AI enables more sophisticated cyberattacks while also offering new capabilities for threat detection and response. The report provides policy recommendations for governments and organizations navigating AI-driven cyber risks.
Lex Fridman interviews Nicole Perlroth, cybersecurity journalist and author of 'This Is How They Tell Me the World Ends,' covering the global cyberweapons arms race, vulnerabilities in critical infrastructure, and the dangers of zero-day exploits. The conversation explores how nation-states develop and deploy offensive cyber capabilities and the systemic risks this poses to society.
Research demonstrates that AI systems, particularly large language models, can autonomously generate functional exploit code for known CVE vulnerabilities in as little as 10-15 minutes. This capability significantly lowers the barrier for cyberattacks by enabling even low-skilled actors to rapidly weaponize disclosed vulnerabilities. The findings raise urgent concerns about the accelerating timeline between vulnerability disclosure and active exploitation.
This resource covers guidance released by the Cybersecurity and Infrastructure Security Agency (CISA) on securing data used in AI systems. It addresses best practices for protecting AI training data, model outputs, and infrastructure from adversarial threats and unauthorized access. The guidance is aimed at organizations deploying AI in critical infrastructure contexts.
This resource covers the Anderson Economic Group's analysis estimating that the 2024 CDK Global cyberattack caused approximately $1 billion in losses to automotive dealerships. The attack disrupted dealer management software used across thousands of US car dealerships, illustrating the cascading economic impact of ransomware attacks on critical commercial infrastructure.
A statistical resource compiling data on AI-enabled cyberattacks, covering the frequency, scale, and impact of AI-assisted malicious activities. It provides quantitative insights into how AI is being weaponized for cybersecurity threats including automated attacks, phishing, and infrastructure targeting.
Darknet Diaries is a podcast covering true stories from the dark side of the internet, including hacking, data breaches, cybercrime, and social engineering. Episodes relevant to AI safety include voice phishing (vishing) attacks, which increasingly leverage voice cloning and deepfake audio to manipulate targets. The podcast provides real-world case studies illustrating how social engineering and emerging AI-enabled deception techniques are deployed against individuals and organizations.
This CSET report by Andrew Lohn (May 2025) analyzes how AI will reshape the cybersecurity offense-defense balance across five domains: digital ecosystem changes, environment hardening, tactical engagements, incentives, and strategic effects. It finds no single winner—AI aids both attackers and defenders—but identifies concrete steps defenders can take to tilt the balance in their favor. The report warns that several missteps could push the balance toward offense.
This R Street Institute commentary analyzes the cybersecurity provisions of the Council of Europe's Framework Convention on Artificial Intelligence, the first legally binding international AI treaty. It highlights five specific measures addressing AI-related cybersecurity risks across critical infrastructure and information systems. The piece evaluates how the treaty's provisions could shape international norms for securing AI-enabled systems.
IBM's annual report quantifies the financial and operational costs of data breaches globally, drawing on real-world incidents across industries. It provides benchmarks for breach detection, response times, and cost drivers including AI adoption, cloud vulnerabilities, and regulatory impacts. The report serves as a key industry reference for cybersecurity risk assessment and investment decisions.
This Anthropic report documents the identification and disruption of what is described as the first known cyber espionage campaign orchestrated using AI systems. It analyzes how AI tools were leveraged to conduct sophisticated information-gathering and intrusion operations, and outlines defensive measures and lessons learned for AI safety and security.