Bioweapons Risk
Comprehensive synthesis of AI-bioweapons evidence through early 2026, including the FRI expert survey finding 5x risk increase from AI capabilities (0.3% → 1.5% annual epidemic probability), Anthropic's ASL-3 activation for Claude Opus 4, and OpenAI's o3 reaching 94th percentile on virology tests. Key developments: DNA screening now catches 97% of threats post-patch, but open-source models (DeepSeek) lack safeguards. Expert consensus: safeguards can reduce risk nearly to baseline even with advanced AI capabilities.
Quick Assessment
| Dimension | Assessment | Evidence |
|---|---|---|
| Current AI Uplift | Low-Moderate (1.3-2.5x) | RAND 2024: no significant difference; Anthropic 2025: "substantially fewer critical failures" with AI |
| Expert Risk Estimate | 0.3% → 1.5% annual with AI capabilities | FRI survey: 5x increase if AI matches expert virologists |
| Frontier Model Status | Expert-level knowledge achieved | OpenAI's o3: 94th percentile on VCT; Claude Opus 4 triggered ASL-3 |
| Screening Evasion | 75%+ pre-patch; 97% post-patch | Microsoft 2024; patch deployed globally Oct 2025 |
| Open-Source Risk | High concern | DeepSeek "worst tested" for biosafety (Amodei 2025) |
| Wet Lab Bottleneck | Remains primary barrier | Soviet Biopreparat: 30,000+ staff over decades; Aum Shinrikyo failed |
| Defense Trajectory | Favored long-term | mRNA platforms, metagenomic surveillance, far-UVC maturing |
| Policy Readiness | Inadequate | CSIS 2025: measures "ill-equipped" for AI threats |
Overview
AI systems could accelerate biological weapons development by helping with pathogen design, synthesis planning, or acquisition of dangerous knowledge. The concern isn't that AI creates entirely new risks, but that it lowers barriers—making capabilities previously requiring rare expertise more accessible to bad actors.
This is considered one of the most severe near-term AI risks because biological weapons can cause mass casualties and AI-assisted bioweapons could be developed by smaller groups than traditional state programs required. Unlike many other AI risks that depend on future, more capable systems, this risk applies to models available today.
The key debate centers on whether AI provides meaningful "uplift"—whether it genuinely helps beyond what's already accessible through scientific literature and internet searches, or whether wet-lab skills remain the true bottleneck. Current evidence is reportedly mixed: a 2024 RAND Corporation study found no statistically significant AI uplift for bioweapon attack planning,[^1] while separate Microsoft research indicated that AI-designed toxins evaded more than 75% of SecureDNA tools.[^2]
However, 2025 has marked a significant shift in official assessments. OpenAI has stated it expects its next-generation models to reach "high-risk classification" for biological capabilities—meaning they could provide "meaningful counterfactual assistance to novice actors."[^3] Anthropic reportedly activated ASL-3 (AI Safety Level 3) protections for Claude Opus 4 specifically due to biological and chemical weapon concerns.[^4] The National Academies of Sciences, Engineering, and Medicine's March 2025 report The Age of AI in the Life Sciences found that while current biological design tools cannot yet design self-replicating pathogens, monitoring and mitigation are urgently needed.[^5] OpenAI's o3 model has also scored at the 94th percentile on the Virology Capabilities Test (VCT), a benchmark designed to measure dangerous biological knowledge.[^6]
Risk Assessment
| Dimension | Assessment | Notes |
|---|---|---|
| Severity | High to Catastrophic | Biological weapons can cause mass casualties; worst-case scenarios involve engineered pandemics |
| Likelihood | Uncertain | Current evidence is mixed on AI uplift; capabilities are rapidly improving |
| Timeline | Near-term | Unlike many AI risks, this concern applies to current systems |
| Trend | Increasing | Each model generation shows more biological knowledge; screening gaps persist |
| Window | Temporary | AI may eventually favor defense (surveillance, vaccines, countermeasures); risk elevated during transition period |
Responses That Address This Risk
| Response | Mechanism | Effectiveness |
|---|---|---|
| Biosecurity Interventions | DNA screening, surveillance, countermeasures, physical defenses | High (portfolio) |
| Responsible Scaling Policies | Internal biosecurity evaluations before deployment | Medium |
| Compute Governance | Limits access to training resources for dangerous models | Medium |
| US AI Chip Export Controls | Restricts AI chip exports to adversary nations | Low-Medium |
| AI Safety Institutes (AISIs) | Government evaluation of biosecurity risks | Medium |
| Voluntary AI Safety Commitments | Lab pledges on dangerous capability evaluation | Low |
The Total Risk Debate
How dangerous is AI-assisted bioweapons development? Expert assessments vary substantially, from those who consider it an imminent catastrophic threat to those who view it as overhyped. Understanding both sides of this debate—and the key uncertainties that drive disagreement—is essential for calibrating policy responses.
Estimating Overall Risk
Attempting to quantify the total risk from AI-assisted bioweapons requires estimating both the probability of an attack and its potential consequences. Estimates vary widely:
| Estimate Type | Range | Source/Basis | Key Assumptions |
|---|---|---|---|
| Annual probability of catastrophic AI-assisted bio attack | 0.01% - 0.5% | Expert elicitation, attack chain analysis | "Catastrophic" = 10,000+ casualties |
| Cumulative probability through 2040 | 0.1% - 8% | Timeline projections | Depends heavily on AI capability trajectory |
| Expected casualties if attack occurs | 10,000 - 10M+ | Historical/scenario analysis | Varies by pathogen, deployment method, response |
| Expected value of harm per year | $1B - $500B | Probability × consequence estimates | Extremely uncertain |
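The table's cumulative and expected-value rows follow from simple probability arithmetic. The sketch below makes that explicit; all inputs are illustrative values drawn from the ranges above, and the assumed $10T per-attack cost is a hypothetical chosen for demonstration, not a figure from any cited source.

```python
# Back-of-envelope arithmetic behind the table above.
# All inputs are illustrative assumptions, not independent estimates.

def cumulative_prob(annual_prob: float, years: int) -> float:
    """Probability of at least one event over `years`, assuming an
    independent, constant annual probability."""
    return 1 - (1 - annual_prob) ** years

low_p, high_p = 0.0001, 0.005  # annual probability range (0.01% - 0.5%)

# Cumulative probability through 2040 (~15 years)
print(f"{cumulative_prob(low_p, 15):.2%}")   # -> 0.15%
print(f"{cumulative_prob(high_p, 15):.2%}")  # -> 7.24%

# Expected annual harm = probability x consequence
cost_per_attack = 1e13  # assumed ~$10T economic cost (roughly COVID-scale)
print(f"${low_p * cost_per_attack / 1e9:,.0f}B")   # -> $1B
print(f"${high_p * cost_per_attack / 1e9:,.0f}B")  # -> $50B
```

Under these assumptions the simple compounding reproduces the table's cumulative range (0.1%–8%) and the low-to-middle portion of its dollar range; the true uncertainty is far wider, as the table notes.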
Expert Survey Data (2025)
The Forecasting Research Institute surveyed 46 biosecurity experts and 22 superforecasters in early 2025 on AI-enabled biorisk.[^7] Their findings are summarized below:[^8]
| Scenario | Annual Risk of 100K+ Death Epidemic | Multiplier |
|---|---|---|
| Baseline (no AI capability increase) | 0.3% | 1x |
| AI matches expert virologists on troubleshooting | 1.5% | 5x |
| AI enables 50% of non-experts to synthesize influenza | 1.25% | 4.2x |
| With mandatory screening + jailbreaking safeguards | 0.4% | 1.3x |
Key insight: According to the FRI survey, safeguards (closed weights, anti-jailbreaking, DNA screening) can reduce risk nearly to baseline even with advanced AI capabilities.[^9]
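The multipliers in the table above are simply ratios of each scenario's annual risk to the 0.3% baseline. A minimal check, using only the figures reported above:

```python
# Multipliers implied by the FRI survey figures (ratios to baseline).
baseline = 0.003  # 0.3% annual risk of a 100K+ death epidemic

scenarios = {
    "AI matches expert virologists":     0.015,   # 1.5%
    "AI enables non-expert synthesis":   0.0125,  # 1.25%
    "Advanced AI + mandated safeguards": 0.004,   # 0.4%
}

for name, risk in scenarios.items():
    print(f"{name}: {risk / baseline:.1f}x baseline")
# -> 5.0x, 4.2x, and 1.3x, matching the survey's reported multipliers
```

The last line is the survey's key result in numeric form: safeguards pull a 5x elevated risk most of the way back toward the 1x baseline.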
The Bioweapons Attack Chain Model estimates compound attack probability at 0.02%–3.6% depending on assumptions, with substantial uncertainty at each step. The wide range reflects genuine disagreement about key parameters.
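An attack-chain model of this kind multiplies conditional success probabilities for each stage. The stage names and values below are hypothetical illustrations (not the model's actual parameters), chosen so the product lands inside the published 0.02%–3.6% range:

```python
import math

# Hypothetical attack-chain stages with assumed conditional probabilities.
stages = {
    "motivated actor attempts an attack":   0.5,
    "acquires agent or synthetic sequence": 0.4,
    "evades DNA synthesis screening":       0.6,
    "succeeds at wet-lab production":       0.3,
    "deploys the agent effectively":        0.5,
}

# Compound probability is the product of the conditional probabilities.
compound = math.prod(stages.values())
print(f"Compound attack probability: {compound:.2%}")  # -> 1.80%
```

Because the stages multiply, pessimism or optimism at any single stage shifts the compound estimate by an order of magnitude, which is why the model's published range is so wide.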
Existential risk context: In The Precipice (2020), Oxford philosopher Toby Ord estimates the chance of existential catastrophe from engineered pandemics at 1 in 30 by 2100—which he identifies as second only to AI among anthropogenic risks.[^10] Ord writes that it "now seems within the reach of near-term biological advances to create pandemics that would kill greater than 50% of the population—not just in a particular area, but globally."[^11] While not all engineered pandemics would be AI-assisted, this frames the potential severity of the threat.
Industry concerns: In July 2023, Anthropic CEO Dario Amodei stated that within two to three years, there was a "substantial risk" that AI tools would "greatly widen the range of actors with the technical capability to conduct a large-scale biological attack."[^12] The Center for a New American Security (CNAS) report on AI and bioweapons notes this could "expose the United States to catastrophic threats far exceeding the impact of COVID-19."[^13]
Arguments for High Risk
Those who consider AI-bioweapons a severe threat emphasize several points:
1. Democratization of Dangerous Knowledge
AI makes dangerous biological knowledge more accessible to those who couldn't previously obtain it. While scientific literature contains detailed protocols, navigating it requires expertise. AI systems can synthesize, explain, and contextualize this information for non-experts, potentially expanding the pool of capable actors.
The equalizer effect: The most concerning scenario isn't AI helping expert virologists (who already have the knowledge), but AI helping moderately skilled individuals bridge knowledge gaps that previously required years of training or team collaboration.
2. Asymmetric Evasion Capabilities
According to reporting on Microsoft's 2024 research, AI-designed toxins reportedly evaded a substantial proportion of commercial DNA synthesis screening tools.[^2] This is qualitatively different from knowledge provision—it represents AI helping attackers circumvent existing defenses.
DNA synthesis screening is a cornerstone of current biosecurity. If AI can reliably design functional variants that evade detection, the entire screening paradigm may become obsolete faster than new defenses can be developed. This creates an asymmetric threat where even modest AI capabilities could undermine years of defensive investment.
3. Rapid Capability Improvement
AI capabilities are improving rapidly. Even if current models provide limited uplift, the trend is concerning:
| Capability | GPT-4 (2023) | Claude 3.5/GPT-4o (2024) | Claude Opus 4/o3 (2025) | Trend |
|---|---|---|---|---|
| Biology knowledge | High | Very High | Expert-level | Rapidly increasing |
| Synthesis planning | Moderate | Moderate-High | High | Increasing |
| Evading guardrails | Moderate | Low-Moderate | Low (frontier models) | Variable by model |
| Integration with tools | Limited | Growing | Substantial | Accelerating |
2025 milestone: OpenAI's April 2025 o3 model reportedly ranked in the 94th percentile among expert human virologists on virology capability evaluations, marking the first time a frontier AI model has demonstrated expert-level performance on biological troubleshooting scenarios.[^14]
The argument is that we should prepare for future capabilities, not just current ones. By the time AI demonstrably provides high uplift, it may be too late to establish governance.
4. Combination with Other Technologies
AI alone may provide limited uplift, but the combination of multiple technologies could be transformative:
```mermaid
flowchart TD
    LLM[Large Language Models] --> COMBO[Compound Capability]
    PROTEIN[Protein Design AI] --> COMBO
    LAB[Lab Automation] --> COMBO
    SYNTH[Cheap DNA Synthesis] --> COMBO
    COMBO --> THREAT[Enhanced Threat]
    style COMBO fill:#ffddcc
    style THREAT fill:#ffcccc
```
- LLMs + protein design tools: Tools such as AlphaFold, which DeepMind released publicly in 2021, enable novel protein structure prediction and engineering; LLMs help identify targets and plan experimental applications.[^15]
- AI + lab automation: Automated systems could eventually execute protocols with minimal human intervention
- AI + decreasing synthesis costs: DNA synthesis costs have fallen dramatically over the past two decades; AI could help design sequences optimized for synthesis on cheaper platforms.[^16]
Each technology alone may be manageable, but their combination could create emergent risks that exceed any individual contribution.
5. Tail Risk Considerations
Even if the median expectation is manageable, the worst-case scenarios are severe enough to warrant serious attention:
- Engineered pandemic: A pathogen designed for transmissibility, lethality, and immune evasion could potentially cause millions of deaths
- Multiple simultaneous attacks: AI could enable coordination of attacks across multiple locations
- Degradation of trust in biology: Widespread bioterrorism could undermine beneficial biological research and public health
From a risk management perspective, low-probability/high-consequence events may deserve more weight than their expected value alone suggests.
6. Historical Underestimation
History suggests we systematically underestimate technology-enabled threats. The first nuclear device was tested in July 1945—less than a decade after the discovery of fission in 1938, a pace faster than many contemporary physicists anticipated.[^17] COVID-19 demonstrated how disruptive a novel pathogen can be, causing millions of deaths and trillions of dollars in economic damage within months of its emergence.[^18] AI capabilities have also repeatedly exceeded near-term forecasts.
Skepticism about AI-bioweapons risk may itself be the risky position.
7. The "De-skilling" Trajectory
Multiple emerging technologies are simultaneously reducing the skill requirements for biological research:
- Cloud laboratories automate complex procedures and allow remote execution
- Benchtop DNA synthesizers are approaching gene-length capabilities
- AI assistants bridge knowledge gaps and provide troubleshooting guidance
- Protocol automation reduces the need for tacit laboratory knowledge
Each of these alone might be manageable, but together they suggest a trajectory toward dramatically lowered barriers. Any current empirical study may capture a snapshot where these technologies haven't yet converged—but convergence appears plausible within the decade.
8. Offense Has Asymmetric Advantages
Biological attacks have inherent asymmetric characteristics that favor attackers:
- Attribution lag: Days to weeks may pass before an attack is recognized as intentional
- Preparation asymmetry: Attackers can prepare countermeasures for themselves; defenders must protect everyone
- Innovation asymmetry: Attackers need to succeed once; defenders must anticipate all possible attack vectors
- Psychological impact: Even unsuccessful or small-scale attacks could cause massive economic and social disruption
AI amplifies these asymmetries by potentially enabling novel attack vectors that existing defenses haven't anticipated.
9. Open-Source Model Proliferation
Even if frontier labs implement strong biosecurity measures, the proliferation of open-source models undermines containment:
- No centralized control: Once weights are released, restrictions cannot be enforced
- Fine-tuning vulnerability: Safety training can be removed with relatively modest compute
- Capability improvements: Open models are approaching frontier capabilities with roughly 6–12 month lags
- Global availability: Actors in any jurisdiction can access open models
The CNAS report AI and the Evolution of Biological National Security Risks recommends considering a "licensing regime for biological design tools with potentially catastrophic capabilities", but this has not been implemented as of 2025.[^19]
The DeepSeek warning: In February 2025, Anthropic CEO Dario Amodei reportedly stated that testing of China's DeepSeek model revealed it was among the worst performers on biosecurity of any model evaluated—generating information relevant to producing bioweapons "that can't be found on Google or can't be easily found in textbooks" with "absolutely no blocks whatsoever."[^20] While Amodei did not characterize DeepSeek as "literally dangerous" at that time, the incident highlighted how open-source models from different jurisdictions may not implement equivalent safety measures.[^21]
Arguments for Lower Risk
Those who consider AI-bioweapons risk overstated emphasize different considerations:
1. The RAND Study: No Significant Uplift
A 2024 RAND Corporation study is among the more rigorous empirical assessments of AI uplift conducted to date. According to reporting on the study, twelve teams of three researchers each spent 80 hours developing bioweapon attack plans—half using AI assistance, half using only open internet resources. Expert evaluators reportedly found no statistically significant difference in plan viability between the two groups.[^1]
This finding directly challenges claims that AI meaningfully assists biological attacks. If AI-assisted and non-AI teams perform equivalently, the AI "threat" may be more limited than feared.
| Group | Information Quality | Plan Viability | Novelty | Statistical Significance |
|---|---|---|---|---|
| AI-assisted | High | Moderate | Low | n/a |
| Internet-only | High | Moderate | Low | n/a |
| Difference | Minimal | Minimal | None | Not significant |
Implications: Dangerous biological information is already widely accessible through legitimate scientific literature. AI may be redundant with existing sources rather than providing novel dangerous capabilities.
2. Wet Lab Bottleneck
Knowledge is not capability. Even with complete theoretical understanding, executing biological synthesis requires:
- Tacit knowledge that transfers poorly through text (how to handle contamination, optimize growth conditions, troubleshoot failures)
- Specialized equipment that is expensive, regulated, and hard to obtain
- Months of practice to develop reliable technique
- Physical safety procedures that untrained individuals typically violate
The Soviet Union's Biopreparat program, established in the 1970s, reportedly employed tens of thousands of scientists and technicians over decades in a state-directed effort to develop reliable bioweapons—a scale of human expertise that underscores the difficulty of the task.[^22] Aum Shinrikyo, despite substantial financial resources and personnel with scientific training, failed repeatedly in their bioweapons attempts throughout the 1990s.[^23] The capability bottleneck may be far more important than the knowledge bottleneck.
AI cannot transfer tacit knowledge. Reading about sterile technique is different from maintaining it reliably under pressure. AI can explain protocols but cannot teach hands-on laboratory skills.
3. Guardrails and Filtering Work
Frontier AI models include safety measures that reduce dangerous information provision:
- Refusals for explicitly harmful requests
- Content filtering
- Constitutional AI and RLHF training
- Continuous red-teaming and patching
While not perfect, these measures raise barriers. Jailbreaking techniques exist but require effort and sophistication, and often produce degraded responses. The marginal attacker may be more likely to use open internet resources than to navigate AI guardrails.
4. Existing Information Abundance
Scientific literature already contains dangerous information. Textbooks explain pathogen biology in detail. The internet hosts synthesis protocols. The marginal information contribution of AI may be minimal when the baseline is that much of this information is already accessible. AI's value proposition—synthesis and accessibility—matters less if motivated individuals were already able to find information through traditional means.
5. Defense Advantages
AI capabilities benefit defense as much as offense, and defensive applications may be more scalable:
| Application | Offense Contribution | Defense Contribution | Net Balance |
|---|---|---|---|
| Pathogen detection | Marginal | Substantial | Defense |
| Vaccine development | Marginal | Transformative | Strong defense |
| Synthesis planning | Moderate | Minimal | Offense |
| Countermeasure design | Marginal | Substantial | Defense |
| Surveillance | None | Substantial | Strong defense |
| Treatment optimization | None | Substantial | Strong defense |
Metagenomic surveillance, mRNA vaccine platforms, and AI-assisted drug discovery are advancing rapidly. These defensive technologies may ultimately make biological attacks less effective rather than more dangerous.
The transition period concern: Even those who believe defense wins long-term often worry about a near-term window where offense temporarily gains advantages before defenses mature.
6. Deterrence and Attribution
Biological attacks, especially sophisticated ones, leave traces that can enable attribution:
- Genomic sequencing of pathogens
- Epidemiological tracking
- Intelligence on precursor purchases
- Surveillance of likely actors
State actors face retaliation risks. Non-state actors face intense investigative focus. The certainty of attribution for significant attacks provides a deterrent effect that pure capability analysis misses.
7. Historical Non-Occurrence
Despite decades of accessible biological knowledge and multiple motivated actors, catastrophic bioterrorism has not occurred. This may indicate genuine difficulty—or it may reflect luck that could change as AI lowers barriers.
The Key Cruxes
Much of the disagreement about AI-bioweapons risk reduces to a small number of factual questions where reasonable people disagree:
Crux 1: Does AI Provide Meaningful Uplift?
If uplift is low (less than 1.5x): Focus resources on traditional biosecurity rather than AI-specific interventions. The threat is real but not qualitatively changed by AI.
If uplift is high (greater than 2x): Urgent need for AI-specific guardrails, compute governance, and model restrictions. The threat landscape has fundamentally shifted.
| Evidence | Favors Low Uplift | Favors High Uplift |
|---|---|---|
| RAND study | Strong | — |
| Screening evasion research | — | Strong |
| Model capability trends | — | Moderate |
| Expert elicitation | Mixed | Mixed |
| Current assessment | Favored (65%) | 35% |
Crux 2: Is the Knowledge Bottleneck or Capability Bottleneck More Important?
If knowledge is the bottleneck: AI providing information is directly dangerous.
If capability is the bottleneck: AI providing information is mostly redundant with existing sources; wet lab skills remain rate-limiting.
| Evidence | Favors Knowledge Bottleneck | Favors Capability Bottleneck |
|---|---|---|
| Historical bioterrorism failures | — | Strong |
| State program difficulty | — | Strong |
| Information abundance online | — | Moderate |
| AI capability trends | Moderate | — |
| Current assessment | 35% | Favored (65%) |
Crux 3: Will Defense or Offense Win Long-Term?
If defense wins: AI-bioweapons is a transitional problem that self-corrects as defensive applications mature.
If offense wins: AI permanently shifts the advantage to attackers, requiring sustained containment efforts.
If it's a window: The near-term favors offense, but defense catches up—the question is whether catastrophic attacks occur during the transition.
| Scenario | Probability | Implications |
|---|---|---|
| Permanent offense advantage | 15% | Maximum concern; sustained containment needed |
| Permanent defense advantage | 40% | Eventually self-correcting; manage transition |
| Temporary window (5-10 years) | 35% | Near-term urgency, medium-term resolution |
| Unclear/context-dependent | 10% | Need robust strategies for multiple scenarios |
Crux 4: How Quickly Are Capabilities Advancing?
If capabilities are saturating: Current systems represent near-peak dangerous capabilities; governance can catch up.
If capabilities continue scaling: Future systems will be substantially more dangerous; governance is racing against time.
The AI-Bioweapons Timeline Model projects capability thresholds, with synthesis assistance potentially arriving 2027-2032 and novel agent design 2030-2040.
Crux 5: How Effective Are Guardrails and Countermeasures?
If guardrails work well: The marginal risk from AI models is small; responsible development practices suffice.
If guardrails fail: Open-source proliferation and jailbreaking make model-level interventions largely ineffective.
| Factor | Favors Guardrails | Favors Guardrail Failure |
|---|---|---|
| Frontier model safety measures | Moderate | — |
| Open-source model proliferation | — | Strong |
| Jailbreaking research | — | Moderate |
| Fine-tuning vulnerability | — | Moderate |
| Current assessment | Partially effective (40%) | Limited effectiveness (60%) |
The open-source challenge: Even if frontier labs implement strong safeguards, open-source models may not. As capable open models proliferate, guardrails become optional, fine-tuning can remove remaining restrictions, and dangerous capabilities become permanently accessible.
Crux 6: Can DNA Synthesis Screening Keep Pace?
DNA synthesis screening is the primary defense against engineered pathogens, but Microsoft's research revealed significant gaps.
If screening adapts: AI-designed evasion is a temporary problem; screening improvements restore the chokepoint.
If screening falls behind: The primary technical barrier erodes; other defenses must compensate.
Key questions:
- Can screening adapt to AI-designed evasive sequences?
- What happens as benchtop synthesis equipment becomes cheaper and more accessible?
- Can screening extend to cover novel synthesis methods and cloud laboratories?
The Framework for Nucleic Acid Synthesis Screening, released by the White House Office of Science and Technology Policy in April 2024, represents a policy response, but it only applies to federally funded programs.
Current Evidence
Studies have shown that large language models (LLMs) can provide information relevant to bioweapon development, though the significance of this is contested.
RAND Red-Team Study (2024)
The RAND Corporation study ("The Operational Risks of AI in Large-Scale Biological Attacks") is reportedly one of the more rigorous empirical assessments of AI uplift conducted to date.[^1] Researchers Christopher Mouton, Caleb Lucas, and Ella Guest reportedly recruited 15 teams of three people to play the role of red-team "bad guys."[^1]
According to the study, twelve teams were given 80 hours each over seven weeks to develop bioweapon attack plans based on one of four scenarios—including a "fringe doomsday cult intent on global catastrophe" and a "private military company seeking to aid an adversary's conventional military operation."[^1] For each scenario, one team had access to an LLM chatbot, another had a different chatbot, and control teams used only internet resources.[^1]
Expert judges (biologists and security specialists) evaluated the resulting plans for biological and operational feasibility. The reported result: no statistically significant difference in plan viability between AI-assisted and non-AI groups.[^1]
Key methodology details:
- Participants had some technical background (science graduates)
- Testing focused on planning, not actual synthesis
- Used 2023-era models; capabilities have advanced since
- Sample size was relatively small (n=12 teams completing the study)
- LLMs did not generate explicit weaponization instructions, but reportedly provided "guidance and context in critical areas such as agent selection, delivery methods, and operational planning"[^1]
Limitations acknowledged by researchers: The study tested planning capability, not execution. It used participants with technical backgrounds, so may underestimate uplift for complete novices. AI capabilities continue advancing.
Implications: The wet-lab bottleneck may be more significant than the knowledge bottleneck. Knowing how to make something is different from being able to make it.
AI-Designed Toxins Evade Screening (2024)
Microsoft researchers reportedly conducted a red-team exercise testing biosecurity in the protein engineering pipeline. According to some sources, DNA screening software—used by synthesis companies to flag dangerous sequences—missed over 75% of AI-designed potential toxins, with one tool flagging only 23% of sequences.[^2] After the research was published, screening systems reportedly improved to catch approximately 72% on average.[^2]
Key details:
- Tested multiple commercial screening tools
- AI reportedly designed functional variants that differed sufficiently from known threats to evade pattern matching
- Improvement after publication shows screening can adapt—but also shows it wasn't keeping pace
Implications: Even if current LLMs provide limited knowledge uplift, AI protein design tools may create harder-to-detect threats. The screening ecosystem has significant gaps that AI can exploit.
Gryphon Scientific Evaluation (2023)
Anthropic hired Gryphon Scientific to red-team Claude's ability to provide harmful biological information.[^3] According to reports, the evaluation involved more than 150 hours of testing and drew on more than 20 biosecurity experts.[^3]
The findings were described as concerning. Rocco Casagrande, Gryphon's managing director, reportedly stated he was "personally surprised and dismayed by how capable current LLMs were at providing critical information related to biological weapons."[^3] He was quoted by Semafor as saying: "These things are developing extremely, extremely fast, they're a lot more capable than I thought they would be when it comes to science."[^3]
Key findings (according to reports):
- One team member with a postdoctoral fellowship studying a pandemic-capable virus found LLMs could provide "post-doc level knowledge to troubleshoot commonly encountered problems" when working with that virus
- For low-skill users, LLMs could suggest which viruses to acquire
- Although LLMs often hallucinate, they answered almost all questions accurately at least sometimes, and answered some critical questions nearly always accurately
- Workshops with biosecurity experts identified concerning misuse scenarios including how to reconstruct information redacted from sensitive scientific documents
Despite the concerning findings, Casagrande reportedly believes "concerted action could ensure safety is built into the most advanced models."[^3]
Anthropic, OpenAI Evaluations
AI labs have conducted extensive internal evaluations testing whether their models could provide "uplift" to potential bioweapon developers.
Anthropic's approach: Anthropic's Responsible Scaling Policy (RSP) defines AI Safety Levels (ASL) modeled after biosafety level (BSL) standards.[^4] The company reportedly conducts at least 10 different biorisk evaluations for each major model release.[^4] In early 2025, Anthropic reportedly sent a letter to the White House urging immediate action on AI security after its testing revealed alarming improvements in Claude 3.7 Sonnet's ability to assist with aspects of bioweapons development.[^5]
OpenAI's framework: OpenAI's Preparedness Framework categorizes biological and chemical capabilities as "Tracked Categories" requiring ongoing evaluation.[^6] They define two thresholds:
- High capability: Could "provide meaningful counterfactual assistance to 'novice' actors (anyone with a basic relevant technical background) that enables them to create known biological or chemical threats"[^6]
- Critical capability: Could "introduce unprecedented new pathways to severe harm"[^6]
OpenAI states their most advanced models "aren't yet capable enough to pose severe risks" in biosecurity—but has reportedly indicated upcoming models may reach "high" capability level.[^6]
US/UK AI Safety Institute joint evaluation (2024): The first joint government-led model evaluation tested Claude 3.5 Sonnet across biological capabilities, cyber capabilities, software development, and safeguard efficacy.[^7] Elizabeth Kelly, AISI director, was quoted as calling it "the most comprehensive government-led safety evaluation of an advanced AI model to date."[^7]
Evaluation Methodology Limitations
An Epoch AI analysis of biorisk evaluations across major AI labs identified significant methodological concerns:[^31]
| Lab | Share of Evals That Are Benchmarks | Red Teaming | Uplift Trials |
|---|---|---|---|
| Anthropic | ≈40% | Yes | Yes (only lab with text-based trials) |
| OpenAI | ≈50% | Yes | No |
| Google DeepMind | ≈80% | No | No |
Key findings (according to Epoch AI):[^31]
- Most publicly described biorisk benchmarks have "rapidly saturated"—AI systems now exceed expert-human baselines
- Benchmarks "practically always fail to capture many real-world complexities"
- Anthropic is the only frontier lab conducting explicit biorisk uplift trials
- Despite limitations, Epoch AI concluded Anthropic was "largely justified" in activating ASL-3
Kevin Esvelt's Classroom Experiment
MIT researcher Kevin Esvelt reportedly conducted an informal demonstration in which he asked students to use ChatGPT or other LLMs to identify dangerous pathogens.[^32] According to some accounts, after approximately one hour, the class had identified four potential pandemic pathogens, methods to generate them from synthetic DNA, names of DNA synthesis companies unlikely to screen orders, and detailed protocols and troubleshooting guidance.[^32]
Esvelt was quoted regarding AI's ability to circumvent DNA screening defenses: "We've built a Maginot Line of defense, and AI just walked around it."[^32]
This demonstration, while not a rigorous study, illustrates how quickly accessible LLMs can be leveraged for potentially dangerous information-gathering—even for those without prior expertise.
CNAS Report: AI and Biological National Security Risks (2024)
The Center for a New American Security (CNAS) report by Bill Drexel and Caleb Withers provides a comprehensive analysis of the evolving AI-biosecurity landscape.[^33]
Key concerns identified:[^33]
- AI could enable bioterrorism, create unprecedented superviruses, and develop novel targeted bioweapons
- AI's potential to "optimize bioweapons for targeted effects, such as pathogens tailored to specific genetic groups or geographies, could significantly shift states' incentives to use biological weapons"
- If realized, such threats could "expose the United States to catastrophic threats far exceeding the impact of COVID-19"
Key recommendations:[^33]
- Strengthen screening mechanisms for cloud labs and genetic synthesis providers
- Conduct rigorous assessments of foundation models' biological capabilities throughout the bioweapons lifecycle
- Invest in technical safety mechanisms to curb threats posed by foundation models
- Consider a licensing regime for biological design tools with potentially catastrophic capabilities
The report emphasizes that while AI-enabled biological catastrophes are "far from inevitable," current biological safeguards already need significant updates.[^33]
2025–2026 Developments: A Pivotal Period
2025 marked a significant shift in how AI labs and governments assess biological risks. According to the Council on Strategic Risks: "The year 2025 brought rising public awareness and discussion of the risks at the AI-biology nexus."[^34]
| Development | Date | Significance |
|---|---|---|
| Evo2 biological AI model released | Feb 2025 | Reportedly trained on 128,000+ genomes |
| FRI expert survey published | Feb 2025 | Reportedly surveyed 46 domain experts and 22 superforecasters on AI-bio risk |
| OpenAI's o3 virology benchmark | Apr 2025 | Reportedly scored at approximately 94th percentile on a virology capabilities test |
| Anthropic ASL-3 activation | May 2025 | First reported use of highest safety tier, for Claude Opus 4 |
| US AI Action Plan biosecurity chapter | Jul 2025 | Federal recognition of AI-enabled pathogen risk |
| UN AI governance bodies formalized | Sep 2025 | Scientific Panel and Global Dialogue established |
| DNA screening patch deployed globally | Oct 2025 | Reportedly achieving approximately 97% detection rate |
| Epoch AI evaluation analysis | 2025 | Found benchmark saturation across labs |
Several specific developments stand out:
OpenAI's High-Risk Classification
OpenAI reportedly announced that upcoming models—particularly successors to the o3 reasoning model—may trigger "high-risk classification" under its Preparedness Framework.[^35] This would mean they could provide "meaningful counterfactual assistance to novice actors" in creating known biological threats.[^35]
Key points from OpenAI's approach (according to reports):[^35]
- Classified ChatGPT Agent as having "High capability in the biological domain"
- Discovered that creating bioweapons would require weeks or months of sustained AI interaction, not single conversations
- Implemented a traffic-light system: red-level content (direct bioweapon assistance) is immediately blocked; yellow-level content (dual-use information) requires careful handling
Anthropic's ASL-3 Activation (May 2025)
Anthropic reportedly became the first lab to activate its highest safety tier (ASL-3) specifically for biological concerns when releasing Claude Opus 4.[^36] Their internal evaluations reportedly found they "could no longer confidently rule out the ability of our most advanced model to uplift people with basic STEM backgrounds" attempting to develop CBRN weapons.[^36]
Anthropic's testing reportedly revealed:[^36]
- Participants with access to Claude Opus 4 developed bioweapon acquisition plans with "substantially fewer critical failures" than internet-only controls
- Claude went from underperforming world-class virologists to "comfortably exceeding that baseline" on virology troubleshooting within a year
National Academies Report (March 2025)
The National Academies of Sciences, Engineering, and Medicine published "The Age of AI in the Life Sciences: Benefits and Biosecurity Considerations," reportedly directed by Executive Order 14110.[^37] Key findings included:[^37]
- AI-enabled biological tools can improve biosecurity through enhanced surveillance and faster countermeasure development
- Current biological design tools can design simpler structures (molecules) but cannot yet design self-replicating pathogens
- A "distinct lack of empirical data" exists for evaluating biosecurity risks of AI-enabled biological tools
- Recommended continued investment alongside monitoring for potential risks
CSIS Policy Analysis (August 2025)
The Center for Strategic and International Studies reportedly published "Opportunities to Strengthen U.S. Biosecurity from AI-Enabled Bioterrorism," warning that current U.S. biosecurity measures are "ill-equipped to meet these challenges."[^38] The report noted that critical safeguards in biological design tools are "already circumventable post-deployment."[^38]
Supplementary Evidence
| Source | Finding | Implications |
|---|---|---|
| National Academies (2025) | BDTs cannot yet design self-replicating pathogens | Current tools limited; monitoring needed |
| CSIS Report (2025) | Current biosecurity measures inadequate | Policy urgently needs updating |
| OpenAI Preparedness (2025) | Next-gen models may hit "high-risk" | Frontier labs anticipate near-term uplift |
| Anthropic ASL-3 (2025) | Cannot rule out CBRN uplift for novices | First reported activation of highest safety tier |
| DeepSeek testing (2025) | Open-source models reportedly lack equivalent safeguards | Proliferation concern raised |
| CNAS Report (2024) | AI-bio integration is emerging risk | Supports compound capability concern |
How AI Could Help Attackers
AI could assist at multiple stages of bioweapon development:
Attack Chain Analysis
A successful biological attack requires success across multiple stages, each with independent failure modes:
```mermaid
flowchart TD
    A[Motivation] --> B[AI/Information Access]
    B --> C[Knowledge Uplift]
    C --> D[Lab Access]
    D --> E[Synthesis Success]
    E --> F[Deployment]
    F --> G[Evades Countermeasures]
    G --> H[Catastrophic Attack]
    style A fill:#fee
    style H fill:#fcc
```
| Stage | AI Contribution | Traditional Difficulty | AI Changes What |
|---|---|---|---|
| Motivation | None | Present | — |
| Information access | High | Moderate | Reduces search time |
| Knowledge uplift | Low-Moderate | High | Bridges expertise gaps |
| Lab access | None | High | — |
| Synthesis | None (currently) | Very High | Future: could guide procedures |
| Deployment | Low | High | Could optimize dispersal |
| Evading countermeasures | Moderate | Variable | Could design novel variants |
See Bioweapons Attack Chain Model for detailed probability estimates at each stage.
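The multiplicative structure of the attack chain can be sketched numerically. The following is a minimal illustrative model with hypothetical per-stage probabilities (placeholders, not the article's or any cited survey's estimates); it shows both why every stage is a chokepoint and why uplift saturates once a stage nears certainty:

```python
# Illustrative attack-chain model: overall success probability is the
# product of per-stage success probabilities, so each stage is a chokepoint.
# All numbers are hypothetical placeholders for illustration only.

STAGES = {
    "motivation": 1.0,           # condition on a motivated actor
    "information_access": 0.8,
    "knowledge_uplift": 0.1,
    "lab_access": 0.05,
    "synthesis_success": 0.02,
    "deployment": 0.1,
    "evades_countermeasures": 0.3,
}

def chain_probability(stages):
    """Multiply per-stage probabilities (assumed independent)."""
    p = 1.0
    for prob in stages.values():
        p *= prob
    return p

def with_uplift(stages, uplifted, factor):
    """Copy of the chain with AI uplift applied to selected stages, capped at 1."""
    return {
        name: min(1.0, prob * factor) if name in uplifted else prob
        for name, prob in stages.items()
    }

baseline = chain_probability(STAGES)
# Suppose AI doubles success odds only at the information and knowledge stages
# (the table above marks lab access and synthesis as unaffected today):
boosted = chain_probability(
    with_uplift(STAGES, {"information_access", "knowledge_uplift"}, 2.0)
)
print(f"baseline: {baseline:.1e}, with uplift: {boosted:.1e}")
print(f"relative increase: {boosted / baseline:.1f}x")
```

Note that the relative increase comes out below the naive 2 × 2 = 4x, because the information-access stage saturates at probability 1.0; this is one reason per-stage uplift factors cannot simply be multiplied.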
Specific Assistance Pathways
Target identification — AI might help identify dangerous modifications to known pathogens or find novel biological agents. Large language models trained on scientific literature have extensive knowledge of pathogen biology.
Synthesis planning — AI could help determine how to create dangerous biological materials. Protein design tools can generate novel sequences, and LLMs can explain synthesis routes.
Knowledge bridging — Most concerningly, AI might help bridge knowledge gaps. Historically, bioweapons development required rare combinations of expertise. AI could help a motivated individual or small group compensate for missing knowledge, potentially replacing what previously required teams of specialists.
Evasion optimization — AI could help design pathogens or synthesis routes that evade detection by screening tools, surveillance systems, or medical countermeasures.
History & Current Infrastructure
Biological threats exist on a spectrum. State programs have historically been the main concern, but the barrier to entry may be dropping. The COVID-19 pandemic demonstrated how much damage pathogens can cause and highlighted gaps in biosecurity infrastructure.
Historical Programs
State Bioweapons Programs
Multiple nations have maintained offensive biological weapons programs despite the Biological Weapons Convention (BWC):8
| Program | Era | Scale | Outcome |
|---|---|---|---|
| US | 1943–1969 | Large | Unilaterally terminated by Nixon |
| Soviet Union | 1928–1992 | Massive (reportedly 30,000–40,000 staff) | Collapsed with USSR; concern about residual capabilities and scientist emigration |
| Japan (Unit 731) | 1937–1945 | Large | Defeated in WWII; perpetrators granted immunity by US in exchange for data |
| Iraq | 1980s–1990s | Moderate | Dismantled after Gulf War; revealed extensive program |
| South Africa | 1981–1993 | Moderate | Dismantled post-apartheid; included ethnic targeting research |
These programs required vast resources, thousands of scientists, and state-level infrastructure. The concern is that AI could reduce these requirements.
Current compliance concerns: According to some sources, the 2024 State Department report raised BWC compliance concerns about China, Russia, Iran, and North Korea.9 Verification remains difficult because the BWC has no formal verification regime.8
The Soviet Biopreparat Program: A Case Study
The Soviet Union reportedly operated one of the world's largest biological weapons programs—in direct violation of the BWC it had signed.8 Understanding this program illuminates both the scale of resources historically required and the ongoing legacy concerns.
Scale and organization:
- Biopreparat was reportedly created in April 1974 as a civilian cover organization10
- Reportedly employed 30,000–40,000 personnel across some 40–50 research facilities, according to accounts by former program insiders10
- Reportedly included five major military-focused research institutes, numerous design facilities, three pilot plants, and five dual-use production plants10
- Annual production capacity for weaponized smallpox was reportedly on the order of 90–100 tons, according to defector accounts11
Agents developed:
- Weaponized smallpox (reportedly continued even after WHO declared global eradication)
- Anthrax (including strains developed as enhanced "battle" variants)
- Plague, Q fever, tularemia, glanders, and Marburg hemorrhagic fever
- Agents reportedly designed for aerosol dispersal via ballistic or cruise missiles10
The Sverdlovsk incident (1979): An accidental release of anthrax spores from a Soviet military facility in Sverdlovsk reportedly killed at least 68 people; the true number remains uncertain because KGB records were reportedly destroyed.12 The Soviet government initially attributed deaths to contaminated meat; Boris Yeltsin publicly acknowledged the military origin in 1992.12
Key defectors who revealed the program:
- Vladimir Pasechnik (defected 1989): Described as a high-level defector to the UK; his reported testimony enabled Western leaders to pressure Gorbachev about the program's scope13
- Ken Alibek (Kanatjan Alibekov, defected 1992): Described as a former first deputy director of Biopreparat; after emigrating he reportedly provided US government with a detailed accounting of the program, including work on tularemia and enhanced anthrax strains11
Legacy concerns:
- Some facilities and scientists were reportedly absorbed into public health institutions after the USSR's dissolution
- US programs attempted to redirect former weapons scientists to peaceful research
- According to contemporaneous reporting, in late 1997 the US expanded efforts after detecting what officials described as intensified attempts by Iran and other states to acquire biological expertise from former Soviet institutes14
Lesson for AI risk: Even with massive state resources, Biopreparat reportedly required decades and thousands of scientists to develop reliable weapons. This suggests the wet-lab barrier is formidable—but also that determined state actors with existing infrastructure could integrate AI assistance more easily than non-state actors starting from scratch.
Non-State Actor Attempts
The historical record of non-state biological attacks reveals consistent technical failures despite significant motivation and resources:
1984 Oregon Salmonella Attack (Rajneeshees)
- Members of the Rajneeshee religious commune deliberately contaminated restaurant salad bars in The Dalles, Oregon with Salmonella typhimurium
- According to CDC records, the attack caused 751 cases of food poisoning and 45 hospitalizations; there were no deaths15
- It remains the largest confirmed bioterrorist attack in U.S. history15
- It used a readily available pathogen requiring no sophisticated laboratory technology
- Key insight: Demonstrated that biological attacks don't require advanced technology, but also that impact was limited without sophisticated delivery
Aum Shinrikyo (1990s)
- Japanese cult with reportedly $1 billion in assets, hundreds of members, and PhD-level scientists16
- Attempted anthrax, botulinum toxin, and other biological agents—all efforts reportedly failed to produce casualties16
- An anthrax sprayer reportedly deployed in Tokyo produced no casualties, attributed partly to use of a vaccine strain by mistake16
- The group eventually succeeded with a sarin chemical attack in the Tokyo subway in 1995, killing 13 people and injuring thousands17
- Key insight: Even well-funded, technically sophisticated groups with scientific personnel have failed at biological weapons. The wet-lab barrier is real.
2001 Anthrax Letters (Amerithrax)
- Letters containing anthrax spores killed 5 people and infected 17 others in the United States18
- The FBI concluded the perpetrator was Bruce Ivins, a senior scientist at USAMRIID with decades of anthrax research experience and legitimate institutional access to spores18
- Key insight: An insider threat—not information access—enabled this attack. The perpetrator already possessed world-class expertise; AI would not have been the limiting factor.
Why has catastrophic bioterrorism not occurred?
| Factor | Explanation |
|---|---|
| Technical difficulty | Synthesis, production, and weaponization require tacit knowledge |
| Pathogen handling | Dangerous to the attacker; requires safety infrastructure |
| Delivery challenges | Aerosol dispersion is technically demanding |
| Attribution risk | Genomic analysis increasingly enables source identification |
| Goal mismatch | Most terrorist groups want publicity, not mass extinction |
| Limited access | Dangerous pathogens are controlled; acquisition is difficult |
This historical record could indicate either genuine difficulty (the barriers are high) or luck (we've been fortunate). The precautionary argument is that AI could systematically lower multiple barriers simultaneously, changing the calculus even if each individual barrier remains partially intact.
Current Biosecurity Infrastructure
DNA synthesis companies already screen orders for dangerous sequences, but screening isn't comprehensive:
| Defense Layer | Coverage | Effectiveness | AI Vulnerability |
|---|---|---|---|
| DNA synthesis screening | Major companies | Reportedly 40–70% (pre-2024); improving19 | High (evasion design) |
| BSL facility access control | High containment | High | Low |
| Pathogen inventory tracking | Research labs | Moderate | Low |
| Export controls (equipment) | Dual-use items | Moderate | Low |
| Disease surveillance | Advanced countries | Moderate–High | Moderate |
| Medical countermeasures | Known pathogens | Moderate | Moderate (novel agents) |
DNA Synthesis Screening: The Critical Chokepoint
DNA synthesis screening is considered the key "chokepoint" in the AI-assisted bioweapons pipeline—if dangerous sequences can be intercepted before synthesis, attacks become much harder. However, significant gaps remain:
Current limitations:
- Participation in the International Gene Synthesis Consortium (IGSC) is voluntary—not all companies are members
- Regulations are inconsistent between countries
- Screening relies on matching against databases of known dangerous sequences—novel variants can evade detection
- High false positive rates require expensive human review
- Benchtop DNA synthesizers are emerging that could bypass commercial screening entirely
Post-Microsoft patch status: After research revealed high evasion rates against existing screening tools, a software patch was deployed to synthesis companies. According to reporting on that effort, the patch reportedly now catches approximately 97% of threats, though experts have cautioned that the fix remains incomplete and gaps persist.20
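To see why database matching struggles with novel variants, consider a deliberately simplified toy screener (a sketch for illustration, not any vendor's actual algorithm): it flags an order if any fixed-length window exactly matches a window of a known sequence of concern, and it can be defeated by spacing one edit per window.

```python
# Toy exact-match DNA screening sketch (hypothetical, illustration only).
# An order is flagged if any length-K window matches a K-mer drawn from a
# database of sequences of concern -- which is why novel or deliberately
# mutated variants can slip past purely exact matching.

K = 12  # window size; real screeners use longer, fuzzier signatures

def signature_kmers(flagged_sequences, k=K):
    """Build the set of all length-k windows from every flagged sequence."""
    kmers = set()
    for seq in flagged_sequences:
        for i in range(len(seq) - k + 1):
            kmers.add(seq[i:i + k])
    return kmers

def screen_order(order, kmers, k=K):
    """Return True if the order shares any k-mer with a flagged sequence."""
    return any(order[i:i + k] in kmers for i in range(len(order) - k + 1))

flagged = ["ATGGCGTACGTTAGCCTAGGATCCGTA"]  # stand-in for a sequence of concern
db = signature_kmers(flagged)

exact_copy = flagged[0]
# One substitution per 12-base window breaks every exact window match:
mutated = list(exact_copy)
mutated[5], mutated[17] = "T", "C"
evading = "".join(mutated)

print(screen_order(exact_copy, db))  # exact copies are caught
print(screen_order(evading, db))     # the lightly edited variant slips past
```

This fragility against minor edits is what approaches like SecureDNA's randomized thresholds and the post-2024 patched screeners are designed to reduce.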
Policy response: In April 2024, the White House OSTP released a Framework for Nucleic Acid Synthesis Screening, requiring federally funded programs to screen customers and orders, keep records, and report suspicious orders.[^52] NIST is partnering with stakeholders to improve screening standards and mitigate AI-specific risks.[^52]
Emerging Defensive Infrastructure
SecureDNA: A Swiss nonprofit foundation providing free, privacy-preserving DNA synthesis screening that already exceeds 2026 regulatory requirements. SecureDNA reportedly screens sequences as short as 50 base pairs using a "random adversarial threshold" algorithm designed to be more robust against AI-designed evasion.
Nucleic Acid Observatory (NAO): A collaboration between SecureBio and MIT pioneering pathogen-agnostic early warning through deep metagenomic sequencing. Unlike traditional surveillance that looks for known pathogens, NAO aims to detect new and unknown pathogens through wastewater and pooled nasal swab sampling.
SecureBio's "Delay, Detect, Defend" strategy: Kevin Esvelt's SecureBio organization works on multiple defensive layers:
- Delay: Synthesis screening and access controls
- Detect: Early warning systems like the NAO
- Defend: Societal resilience through germicidal UV light, pandemic-proof PPE stockpiles, and rapid countermeasure development
Emerging Technologies of Concern
Several emerging technologies could compound AI-enabled biosecurity risks by removing barriers that currently limit attack feasibility:
Benchtop DNA Synthesizers
A new generation of desktop DNA synthesis devices may enable users to print DNA in their own laboratories, potentially bypassing commercial screening entirely.
Current products:
- Kilobaser: Personal DNA/RNA synthesizer, reportedly measuring 27×33×33 cm, producing oligos in approximately 30–50 minutes with around 2.5 min/base turnaround, according to manufacturer specifications
- DNA Script SYNTAX System: Enzymatic DNA synthesis (water-based, avoiding harsh chemicals), reportedly supporting 96 parallel oligos up to 120 nucleotides per the company's published materials
- Evonetix Evaleo: Gene-length DNA synthesis on silicon chips, with the company claiming speeds approximately 10× faster than current technologies
- BioXp (Telesis Bio): Commercial benchtop synthetic biology workstation automating pipetting, mixing, thermal cycling, purification, and storage
Current limitations:
- According to some sources, most benchtop devices are limited to sequences under 120 base pairs—insufficient for most dangerous applications
- Not yet viable alternatives to centralized DNA providers for gene-length sequences
- Quality control and yield often inferior to commercial synthesis
Biosecurity implications:
- An NTI analysis reportedly notes that "three converging technological trends—enzymatic synthesis, hardware automation, and increased demand from computational tools—are likely to drive rapid advancement in benchtop capabilities over the next decade"
- Manufacturers should implement rigorous sequence screening for each fragment produced
- Governments should provide clear regulations for manufacturers to incorporate screening
- Once capabilities exceed current limits, benchtop devices could become a significant biosecurity gap
Cloud Laboratories
Cloud laboratories are heavily automated, centralized research facilities where scientists run experiments remotely from computers. They present unique biosecurity challenges:
How cloud labs lower barriers:
- Reduce technical skill requirements by automating complex procedures
- Enable "one-stop-shop" research that could expand the pool of capable actors
- Allow experiments to be performed remotely, potentially bypassing ethical constraints in traditional academic settings
- Researchers retain full control over experimental design without physical presence
Current governance gaps:
- No public data on cloud lab operations, workflows, customer numbers, or locations worldwide
- No standardized approaches for customer screening shared between organizations
- Cybersecurity laws don't account for unique vulnerabilities of biological data and lab automation systems
- Biosafety regulations typically neglect digital threats like remote manipulation of synthesis machines
Proposed solutions (RAND):
- Create a Cloud Lab Security Consortium (CLSC) modeled on the International Gene Synthesis Consortium (IGSC) for DNA synthesis
- Minimum security standards: customer screening, controlled substance access, experiment screening, secured networks
- Human-in-the-loop controls when AI systems place synthesis orders for sequences of concern
Biological Design Tools (BDTs)
Beyond LLMs, specialized biological design tools present distinct risks:
AlphaFold and protein structure prediction:
- Revolutionary tool for predicting protein structure from genetic sequence; according to some sources, achieving over 90% accuracy on benchmark datasets
- Could enable optimization of existing hazards: increasing toxicity, improving immune evasion, enhancing transmissibility
- Could potentially enable design of completely novel toxins targeting human proteins
- Google DeepMind reportedly engaged more than 50 domain experts in biosecurity assessment during development of AlphaFold 3, according to published accounts
- Implements experimental refusal mechanisms to block misuse—but biological design often resides in dual-use space
Other BDT concerns:
- Machine learning for prediction of host range, transmissibility, and virulence
- Generative models for novel agent design
- Tools that help design sequences evading DNA screening (as demonstrated in published Microsoft research)
Dual-use nature: Unlike LLM guardrails, where harmful requests are often clearly distinguishable, biological design tool queries are frequently dual-use. The same protein optimization that could enhance a therapeutic could theoretically enhance a toxin. This makes technical controls more difficult than for text-based LLMs.
Policy recommendations (UNICRI):
- Prerelease evaluation requirements for advanced biological models regardless of funding source
- Prioritize mitigating risks of pathogens capable of causing major epidemics
- Preserve researcher autonomy while implementing targeted controls on highest-risk capabilities
Research Governance & International Law
AI-enabled bioweapons risk exists within a broader context of biosecurity challenges, including ongoing debates about research oversight and international governance gaps.
Gain-of-Function and Enhanced Pandemic Pathogen Research
Gain-of-function (GoF) research—experiments that enhance pathogen transmissibility, virulence, or host range—has become intensely controversial, with implications for AI-biosecurity debates:
Recent policy developments:
- May 2024: The White House Office of Science and Technology Policy released the "Policy for Oversight of Dual Use Research of Concern and Pathogens with Enhanced Pandemic Potential" (DURC/PEPP Policy).[^52]
- May 2025: An executive order reportedly blocked the 2024 policy the day before it was scheduled to take effect.[^53]
- Ongoing: NIH reportedly identified more than 40 projects that may meet definitions of dangerous GoF research and, according to some sources, demanded scientists suspend work.[^54]
Congressional activity:
- The House approved a ban on federal funding for GoF research modifying risky pathogens.
- Scientific groups warn that vaguely worded provisions could unintentionally halt flu vaccine development and other beneficial research.
- The Risky Research Review Act (S. 854, H.R. 1864) would establish a life sciences research security board.
Key limitation: Both the 2014 DURC Policy and the 2024 PEPP Policy apply only to government-funded research. Extending coverage to privately funded research would require new regulations or legislation. AI labs developing biological design tools with private funding currently face no equivalent oversight requirements.
Relevance to AI risk: The GoF debate previews challenges AI governance will face:
- Distinguishing beneficial from dangerous research is difficult.
- Oversight mechanisms are primarily voluntary and apply only to government-funded work.
- International coordination is lacking.
- Technical definitions ("gain of function," "enhanced pandemic potential") are contested.
The Biological Weapons Convention: Structural Weaknesses
The Biological Weapons Convention (BWC), opened for signature in 1972, prohibits the development, production, and stockpiling of biological weapons.[^55] As of the most recent review cycle it has 187 states parties.[^56] Despite its broad membership, the treaty has significant structural weaknesses.
No verification regime:
- Unlike chemical and nuclear weapons agreements, the BWC contains no formal verification provisions.
- Attempts to develop a verification protocol collapsed in 2001 after years of negotiation.[^57]
- According to some analysts, governments effectively ceased substantive discussion of verification within the treaty framework for over two decades following that failure.
Minimal institutional support:
- The BWC Implementation Support Unit has only four staff members.[^58]
- Its budget is, according to some sources, smaller than that of an average McDonald's restaurant—a comparison attributed to philosopher Toby Ord.[^59]
- By contrast, the IAEA employs more than 2,500 staff and the OPCW more than 500 staff.[^60]
Recent developments:
- December 2022: States Parties established a Working Group on Strengthening the Convention.[^61]
- 2024: The fourth and fifth Working Group sessions were held in August and December 2024.
- December 2024: The fifth session reportedly "ended with a regrettable conclusion in which a single States Party undermined the noteworthy progress achieved"—a setback described by the Council on Strategic Risks.[^62]
- The Working Group reportedly has only seven days of scheduled time through the end of 2025 allocated specifically for verification discussion.[^63]
Practical limitations:
- No politically palatable, technologically feasible, and financially sustainable system can guarantee detection of all biological weapons programs.
- Rapid advances in biotechnology create new verification challenges.
- AI capabilities could make verification even more difficult by enabling novel agent design.
What's possible: While perfect verification is unachievable, analysts including those writing in the Bulletin of the Atomic Scientists have argued that "measures in combination could generate considerably greater confidence in compliance by BWC states parties."[^64]
Defensive Technologies and Pandemic Preparedness
The same technological advances that could enable attacks also offer powerful defensive capabilities. Many experts believe defense will ultimately dominate the offense-defense balance; the question is whether we are in a dangerous transition period.
mRNA Vaccine Platforms
The COVID-19 pandemic demonstrated the transformative potential of mRNA vaccines for rapid response:
Speed advantages:
- Traditional vaccines require time-consuming manufacturing with live pathogens
- mRNA vaccines can be designed in days once a pathogen's genetic sequence is known[^65]
- COVID-19 mRNA vaccines received FDA Emergency Use Authorization in under one year—unprecedented in vaccine history[^66]
- CEPI's "100 Days Mission" aims to develop safe, effective vaccines against novel threats within 100 days of a pandemic being declared[^67]
Manufacturing advantages:
- Cell-free manufacture enables accelerated, scalable production
- Standardizable processes require minimal facility adaptations between products
- Smaller manufacturing footprints than traditional vaccines
- Same facility can produce multiple vaccine products
Safety profile:
- mRNA does not enter the cell nucleus and cannot integrate into the cellular genome[^68]
- Can be administered repeatedly without triggering anti-vector immunity (unlike viral vector vaccines)
- Avoids live pathogen handling in manufacturing
Pandemic preparedness implications:
- Platform is "pathogen-agnostic"—the same technology works against any target with a known sequence
- BARDA and CEPI are reportedly supporting development of dozens of vaccine candidates against high-risk pathogens
- Next-generation "trans-amplifying" mRNA vaccines under development could provide stronger immune responses at lower doses
For AI-bioweapons specifically: Rapid vaccine development could limit the damage from engineered pathogens if detected early. However, novel agents designed to evade detection or existing countermeasures would still pose severe risks during the response window.
Metagenomic Surveillance
Traditional disease surveillance looks for known pathogens. Metagenomic sequencing offers pathogen-agnostic detection:
How it works:
- Deep sequencing of all genetic material in samples (wastewater, nasal swabs, etc.)
- Computational analysis identifies viral, bacterial, and other sequences
- Can detect novel or unexpected pathogens that would not be caught by targeted testing
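The classification step can be sketched as a toy k-mer matcher. The reference fragments below are invented; production classifiers index complete genome databases and handle sequencing error, but the principle is the same.

```python
# Toy pathogen-agnostic read classification by shared k-mers.
# Reference fragments are invented for illustration.

def kmers(seq, k=8):
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}

# Hypothetical reference panel: organism -> genome fragment
REFERENCES = {
    "influenza_A": "ATGAAGGCAATACTAGTAGTTCTGCTATATACATTTGCAACC",
    "sars_cov_2":  "ATGTTTGTTTTTCTTGTTTTATTGCCACTAGTCTCTAGTCAG",
}
REF_INDEX = {name: kmers(seq) for name, seq in REFERENCES.items()}

def classify_read(read, min_shared=3):
    """Assign a read to the reference sharing the most k-mers,
    or 'unclassified' if nothing clears the threshold."""
    read_kmers = kmers(read)
    best, best_hits = "unclassified", 0
    for name, ref_kmers in REF_INDEX.items():
        hits = len(read_kmers & ref_kmers)
        if hits >= min_shared and hits > best_hits:
            best, best_hits = name, hits
    return best

print(classify_read("ATGAAGGCAATACTAGTAGTT"))   # influenza_A
print(classify_read("GGGGGGGGGGGGGGGGGGGG"))    # unclassified
```

For novel-pathogen detection, the unclassified bucket is the interesting signal: a rising stream of sequences matching no reference can be flagged for follow-up even when no known target matches.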
Current research:
- Nucleic Acid Observatory (NAO): Sequencing wastewater from major US airports and treatment plants to establish pathogen-agnostic baselines
- One published dataset reportedly comprised 13.1 terabases sequenced from 20 wastewater samples collected at the Los Angeles Hyperion treatment plant, which serves approximately 4 million residents[^69]
- A Lancet Microbe publication established sensitivity models for wastewater metagenomic sequencing (W-MGS) detection
Sensitivity and cost tradeoffs:
- Untargeted shotgun sequencing is less sensitive than targeted methods for known pathogens
- Hybridization capture panels can greatly increase sensitivity for viruses included in the panel, but may reduce sensitivity to entirely unknown pathogens
- Large variation in viral detection exists based on sewershed hydrology and laboratory protocols
- Modeled sensitivity for certain bacterial pathogens has been estimated at roughly 1 infected person detectable among 257–2,250 individuals in a sewershed, according to published sensitivity analyses[^70]
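A back-of-envelope version of such a sensitivity model, treating pathogen-derived read counts as Poisson-distributed. The depth and abundance figures below are illustrative assumptions, not values from the cited analyses.

```python
import math

def p_detect(total_reads, rel_abundance, min_reads=2):
    """Probability of observing at least `min_reads` pathogen-derived
    reads, with counts Poisson-distributed around a mean of
    total_reads * rel_abundance."""
    lam = total_reads * rel_abundance
    p_below = sum(math.exp(-lam) * lam**k / math.factorial(k)
                  for k in range(min_reads))
    return 1 - p_below

# 100M reads at an assumed relative abundance of 1e-7
# (expected 10 pathogen reads): detection is near-certain.
p_deep = p_detect(100_000_000, 1e-7)

# Cut depth and abundance tenfold each (expected 0.1 reads)
# and detection essentially vanishes.
p_shallow = p_detect(10_000_000, 1e-8)
```

This is why sensitivity scales with both sequencing depth and outbreak stage: low-prevalence infections sit below the detection floor until shedding into the sewershed accumulates.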
For AI-bioweapons specifically: Metagenomic surveillance could provide early warning for engineered pathogens that evade targeted detection. However, sensitivity limits mean outbreaks may need to reach significant scale before detection occurs.
Far-UVC Germicidal Light
Far-UVC light, operating in the 200–235 nm wavelength range, is emerging as a potentially transformative technology for airborne pathogen inactivation in occupied spaces[^71]:
Why it's different from conventional UV:
- Conventional germicidal UV-C (254 nm) harms human skin and eyes—restricting its use to upper-room applications or unoccupied spaces
- Far-UVC (typically 222 nm) is absorbed in the outer dead layer of human skin and in the tear layer of the eyes, and cannot penetrate to living tissue[^72]
- This property enables direct disinfection of the breathing zone while people are present
Efficacy:
- A very low dose of 2 mJ/cm² of 222-nm light has been reported to inactivate more than 95% of airborne H1N1 influenza virus in laboratory conditions[^73]
- Studies suggest a single far-UVC fixture can deliver the equivalent of 33–66 air changes per hour for pathogen removal[^74]
- Far-UVC has been tested against tuberculosis, SARS-CoV-2, influenza, and murine norovirus, with reported reductions of up to 99.8% for murine norovirus[^75]
- A 2025 review characterized far-UVC as having "high ability" to kill pathogens with a "high level of safety," though the authors noted that long-term human exposure data remain limited[^76]
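The reported figures connect through the standard first-order inactivation model, survival S = exp(−k·D). A rough sketch: back the susceptibility constant k out of the H1N1 result, then convert a steady fluence rate into equivalent air changes per hour. The 5 µW/cm² fluence rate below is an assumed illustrative value, not a measurement.

```python
import math

def susceptibility(dose_mj_cm2, inactivated_fraction):
    """Solve exp(-k * D) = 1 - inactivated_fraction for k (cm^2/mJ)."""
    return -math.log(1 - inactivated_fraction) / dose_mj_cm2

def equivalent_ach(k, fluence_uw_cm2):
    """Equivalent air changes per hour from steady far-UVC exposure.
    1 uW/cm^2 corresponds to 1e-3 mJ/(cm^2 * s) of dose rate."""
    dose_rate = fluence_uw_cm2 * 1e-3
    return k * dose_rate * 3600

# >95% inactivation of airborne H1N1 at 2 mJ/cm^2 implies k ~ 1.5 cm^2/mJ
k_h1n1 = susceptibility(2.0, 0.95)

# An assumed average fluence of 5 uW/cm^2 then gives ~27 eACH,
# the same order as the 33-66 eACH reported for single fixtures.
each = equivalent_ach(k_h1n1, 5.0)
```

The exponential form also explains why far-UVC complements rather than replaces ventilation: each is a multiplicative reduction on airborne pathogen load, so their equivalent air-change rates add.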
Applications for pandemic preparedness:
- Installation in hospitals, schools, airports, and public transit could dramatically reduce airborne transmission
- Blueprint Biosecurity is reportedly funding research teams to evaluate deployment in real-world scenarios
- Coefficient Giving has issued an RFI on far-UVC evaluation
- NIST is reportedly collaborating with industry on standards development
Remaining questions:
- Long-term human exposure effects require further research
- Real-world efficacy in varied building environments is not yet fully characterized
- Cost and feasibility of widespread deployment remain open questions
For AI-bioweapons specifically: Far-UVC could provide a layer of defense against aerosol-dispersed biological agents in public spaces. Even if attackers successfully synthesize and deploy pathogens, widespread far-UVC installation could limit transmission and buy time for medical countermeasure deployment.
Mitigations
Model-Level Interventions
Refusals and filtering — Training models not to help with bioweapon development and filtering dangerous outputs. These are imperfect: models can be jailbroken or fine-tuned, and open-source models may lack restrictions entirely.
Effectiveness assessment:
- Reduces casual misuse
- Raises barrier for unsophisticated actors
- Does not prevent determined actors with technical skills
- Cannot address open-source model proliferation
Evaluations before deployment — Testing models for biosecurity risks during development, as part of responsible scaling policies. Useful but relies on labs' good faith and competence.
AI-Specific Governance
Compute governance — Limiting who can train powerful models reduces the availability of capable models to bad actors. Information security around model weights becomes important if models can provide meaningful uplift.
Biological capability thresholds — Anthropic's RSP and similar frameworks establish biological capability as a key threshold for enhanced safety measures. This creates systematic evaluation requirements.
Open-source restrictions — Limiting the release of model weights for systems with significant biological knowledge. Controversial due to benefits of open research.
Broader Biosecurity Measures
Broader biosecurity measures may matter more than AI-specific interventions:
| Intervention | Cost | Risk Reduction | Priority |
|---|---|---|---|
| DNA synthesis screening | ≈$100M/year | 5-15% | High |
| Metagenomic surveillance | ≈$500M/year | 15-25% | Very High |
| BSL facility security | ≈$200M/year | 5-10% | High |
| Pandemic response stockpiles | ≈$2B/year | 10-20% | Medium-High |
| International verification | ≈$300M/year | 3-8% | Medium |
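As a rough screen, the table's figures can be reduced to risk reduction per dollar (midpoint of each range per $1B/year). The numbers are the rough estimates above, not independent data.

```python
# Rank interventions by midpoint risk reduction per $1B/year,
# using only the rough figures from the table above.
interventions = {
    "DNA synthesis screening":      (100e6, (0.05, 0.15)),
    "Metagenomic surveillance":     (500e6, (0.15, 0.25)),
    "BSL facility security":        (200e6, (0.05, 0.10)),
    "Pandemic response stockpiles": (2e9,   (0.10, 0.20)),
    "International verification":   (300e6, (0.03, 0.08)),
}

def points_per_billion(cost, rr_range):
    """Percentage points of risk reduction (range midpoint) per $1B/year."""
    return 100 * (rr_range[0] + rr_range[1]) / 2 / (cost / 1e9)

ranked = sorted(interventions,
                key=lambda name: points_per_billion(*interventions[name]),
                reverse=True)
# DNA synthesis screening tops the per-dollar ranking (~100 pts/$B);
# stockpiles, despite large absolute impact, come last (~7.5 pts/$B).
```

Note the per-dollar ordering differs from the table's priority column: metagenomic surveillance is rated Very High presumably because absolute risk reduction and feasibility matter alongside cost-effectiveness.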
DNA synthesis screening — Flagging dangerous sequences before synthesis. The primary defense but has significant gaps that AI can exploit.
Laboratory access controls — Restricting who can work with dangerous pathogens. Effective for legitimate facilities; doesn't address improvised labs.
Disease surveillance — Early detection of outbreaks. Benefits from AI advances and may be where AI provides greatest defensive value.
Medical countermeasures — Rapid vaccine and treatment development. mRNA platforms demonstrated during COVID-19 show how quickly responses can be developed.
Timeline
| Date | Event |
|---|---|
| 1972 | Biological Weapons Convention signed (now 187 states parties) |
| 1984 | Rajneeshee salmonella attack sickens 751 people, the largest bioterrorist attack in US history |
| 1995 | Aum Shinrikyo attempts bioweapons (anthrax, botulinum), fails; uses sarin instead |
| 2001 | Anthrax letters kill 5, infect 17; perpetrator was an insider with legitimate access |
| 2020 | Toby Ord publishes The Precipice, estimating 1/30 existential risk from engineered pandemics |
| 2020-21 | COVID-19 demonstrates pandemic potential; exposes biosecurity gaps |
| 2022 | Collaborations Pharmaceuticals shows AI drug discovery model can generate novel toxic molecules in hours |
| 2023 (July) | Dario Amodei warns of "substantial risk" AI will enable bioattacks within 2-3 years |
| 2023 (Nov) | Gryphon Scientific red-team finds Claude provides "post-doc level" biological knowledge |
| 2024 (Jan) | RAND red-team study finds no significant AI uplift for bioweapon planning |
| 2024 (Apr) | White House OSTP releases Framework for Nucleic Acid Synthesis Screening |
| 2024 (May) | Microsoft research reveals 75%+ of AI-designed toxins evade DNA screening |
| 2024 (Aug) | CNAS publishes report on AI and biological national security risks |
| 2024 (Aug) | US AI Safety Institute signs agreements with Anthropic and OpenAI for biosecurity evaluation |
| 2023 (Oct) | Executive Order 14110 directs National Academies to study AI biosecurity |
| 2024 (Nov) | US/UK AI Safety Institutes conduct first joint model evaluation (Claude 3.5 Sonnet) |
| 2024 (Dec) | Anthropic RSP includes 10+ biological capability evaluations per model |
| 2025 (Jan) | Anthropic sends letter to White House citing "alarming improvements" in Claude 3.7 Sonnet |
| 2025 (Feb) | Anthropic CEO reports DeepSeek was "the worst" model tested for biosecurity safeguards |
| 2025 (Mar) | National Academies publishes "The Age of AI in the Life Sciences" report |
| 2025 (Apr) | OpenAI's o3 model ranks 94th percentile among expert virologists on capability test |
| 2025 (May) | Anthropic activates ASL-3 protections for Claude Opus 4 due to CBRN concerns |
| 2025 (Jun) | OpenAI announces next-gen models will hit "high-risk" biological classification |
| 2025 (Jul) | OpenAI hosts biodefense summit with government researchers and NGOs |
| 2025 (Jul) | Trump administration's AI Action Plan identifies biosecurity as priority |
| 2025 (Aug) | CSIS publishes "Opportunities to Strengthen U.S. Biosecurity from AI-Enabled Bioterrorism" |
| 2025 (Sep) | UN formalizes International Scientific Panel on AI and Global Dialogue on AI Governance |
| 2025 (Oct) | Microsoft publishes Science paper; screening patch deployed globally (97% effective) |
| 2025 (Oct) | Hoover Institution warns biotech+AI is "one of the biggest emerging security threats" |
| 2025 (Dec) | Council on Strategic Risks publishes "2025 AIxBio Wrapped" year-in-review |
| 2026 (Jan) | Epoch AI finds biorisk benchmarks have "rapidly saturated" across frontier labs |
Expert Perspectives
Expert opinion on AI-bioweapons risk is divided, with prominent voices on both sides:
Those More Concerned
Kevin Esvelt (MIT): One of the most vocal experts on AI-biosecurity risks. Esvelt emphasizes that if you ask a chatbot how to cause a pandemic, "it will suggest the 1918 influenza virus. It will even tell you where to find the gene sequences online and where to purchase the genetic components." He co-founded SecureDNA and SecureBio to address these risks.
Dario Amodei (Anthropic CEO): In July 2023, stated there was a "substantial risk" that within 2-3 years, AI would "greatly widen the range of actors with the technical capability to conduct a large-scale biological attack." In February 2025, reported that DeepSeek was "the worst" model tested for biosecurity, generating information "that can't be found on Google or easily found in textbooks."
Johannes Heidecke (OpenAI Head of Safety Systems): In June 2025, announced OpenAI expects upcoming models to hit "high-risk classification" for biological capabilities. Emphasized that "99% or even one in 100,000 performance is [not] sufficient" for testing accuracy.
Rocco Casagrande (Gryphon Scientific): After red-teaming Claude, said he was "personally surprised and dismayed by how capable current LLMs were" and that "these things are developing extremely, extremely fast."
Toby Ord (Oxford): Estimates engineered pandemic risk at 1 in 30 by 2100—second highest anthropogenic existential risk after AI itself.
Georgia Adamson and Gregory C. Allen (CSIS): Their August 2025 report warns current U.S. biosecurity measures are "ill-equipped" to meet AI-enabled challenges, with BDT safeguards "already circumventable post-deployment."
Bill Drexel and Caleb Withers (CNAS): Their August 2024 report warns AI could enable "catastrophic threats far exceeding the impact of COVID-19."
Those More Skeptical
RAND researchers (Mouton, Lucas, Guest): Their 2024 study found "no statistically significant difference" between AI-assisted and non-AI groups in bioweapon planning capability. This is the strongest empirical evidence against immediate AI uplift concerns.
Some biosecurity practitioners: Emphasize that the wet lab bottleneck—tacit knowledge, equipment access, technique—remains the primary barrier, and AI cannot transfer hands-on skills.
Information abundance argument: Dangerous information is already accessible through scientific literature and the internet. AI may provide convenience but not fundamentally new capabilities.
The Disagreement Structure
The debate often reduces to different assessments of:
| Question | Higher Concern View | Lower Concern View |
|---|---|---|
| Current uplift | 2025 lab evaluations show expert-level capabilities | RAND 2024 study is most rigorous empirical evidence |
| Future trajectory | OpenAI/Anthropic expect "high-risk" soon | May plateau; defenses improving |
| Key bottleneck | Knowledge gap narrowing fast | Wet lab skills remain rate-limiting |
| Guardrail effectiveness | DeepSeek shows open-source gaps | Frontier labs implementing robust safeguards |
| Risk tolerance | ASL-3 activation signals real concern | Base rates suggest low probability |
2025 shift: The debate has evolved significantly. Both major frontier labs now officially acknowledge their next-generation models pose elevated biological risks. The question is shifting from "does AI provide uplift?" to "how much uplift, and can mitigations keep pace?"
Notably: Even those who downplay current uplift often acknowledge that future models may pose greater risks, and that defensive investments are worthwhile regardless.
Sources & Resources
2025-2026 Key Sources
| Source | Type | Key Finding |
|---|---|---|
| Forecasting Research Institute | Expert survey | 5x risk increase from AI; safeguards reduce risk to baseline |
| Council on Strategic Risks Year Review | Analysis | Rising awareness of AIxBio risks; UN governance bodies formed |
| Epoch AI Evaluation Analysis | Methodology review | Biorisk benchmarks saturated; Anthropic only lab with uplift trials |
| CSIS Policy Analysis | Policy | US biosecurity measures "ill-equipped" for AI threats |
| Anthropic Biorisk Methodology | Technical | ASL-3 activation justified; "substantially fewer critical failures" |
| OpenAI Biology Preparedness | Technical | Next-gen models expected to hit "high-risk" classification |
Primary Research
- RAND Corporation (2024): The Operational Risks of AI in Large-Scale Biological Attacks: Results of a Red-Team Study - The most rigorous empirical study of AI uplift to date
- Microsoft Research (2025): AI-designed toxins evade DNA screening - Published in Science, October 2025
- National Academies (2025): The Age of AI in the Life Sciences: Benefits and Biosecurity Considerations - Comprehensive government-commissioned study on AI biosecurity risks
- Gryphon Scientific (2023): Red-team evaluation of Claude's biological capabilities - Coverage in Semafor
- UNICRI (2021): The Potential for Dual-Use of Protein-Folding Prediction - Early analysis of AlphaFold biosecurity implications
- Council on Strategic Risks (2023): The Cyber-Biosecurity Nexus
Policy and Analysis
- CSIS (2025): Opportunities to Strengthen U.S. Biosecurity from AI-Enabled Bioterrorism by Georgia Adamson and Gregory C. Allen
- CNAS (2024): AI and the Evolution of Biological National Security Risks by Bill Drexel and Caleb Withers
- White House OSTP (2024): Framework for Nucleic Acid Synthesis Screening
- White House OSTP (2024): Policy for Oversight of DURC and PEPP
- NIST/AISI (2024): Pre-deployment evaluation of Claude 3.5 Sonnet
- Congressional Research Service: Oversight of Gain-of-Function Research with Pathogens: Issues for Congress
Industry Frameworks
- Anthropic: Responsible Scaling Policy
- Anthropic (2025): Biorisk Evaluations - Detailed methodology for Claude Opus 4 safety testing
- OpenAI: Preparedness Framework
- OpenAI (2025): Preparing for Future AI Capabilities in Biology - High-risk classification announcement
- OpenAI (2024): Building an early warning system for LLM-aided biological threat creation
- Google DeepMind: Our approach to biosecurity for AlphaFold 3
Biosecurity Organizations
- SecureDNA: DNA synthesis screening platform - Swiss nonprofit developing cryptographic screening technology to prevent synthesis of dangerous pathogens by DNA synthesis providers
- SecureBio: Pandemic preparedness organization - Research organization focused on reducing biological risks, particularly those arising from advances in biotechnology and AI-enabled capabilities
- Nucleic Acid Observatory: Pathogen-agnostic surveillance - Collaboration between SecureBio and MIT's Sculpting Evolution group developing metagenomic biosurveillance systems capable of detecting novel or engineered pathogens
- Nuclear Threat Initiative (NTI): Biosecurity resources - Program addressing natural, accidental, and intentional biological threats, including the intersection of AI and biotechnology
- Blueprint Biosecurity: Far-UVC research - Biosecurity research and policy organization emphasizing far-UVC light as a scalable intervention against airborne pathogen transmission
Emerging Technologies
- NTI (2024): Benchtop DNA Synthesis Devices: Capabilities, Biosecurity Implications, and Governance - Examines how benchtop synthesis devices could decentralize DNA production and circumvent the screening protocols maintained by centralized providers
- RAND (2024): Documenting Cloud Labs and Examining How Remotely Operated Automated Laboratories Could Enable Bad Actors - Surveys 15 cloud laboratory organizations worldwide and analyzes how these remotely operated, automated facilities could be misused
- RAND (2024): Robust Biosecurity Measures Should Be Standardized at Scientific Cloud Labs - Commentary arguing for standardized biosecurity measures at highly automated cloud laboratory facilities
- EMBO Reports (2024): Security challenges by AI-assisted protein design - Examines dual-use risks from AI-powered protein design tools and the case for proactive governance before misuse becomes feasible
International Governance
- Arms Control Association: The Biological Weapons Convention (BWC) At A Glance - Overview of the treaty's history, membership, key provisions, and ongoing verification and enforcement challenges
- Arms Control Association (2024): Strengthening the Biological Weapons Convention - Examines the BWC working group established in 2022 and the tension between legally binding inspection regimes and a flexible opt-in approach
- Bulletin of the Atomic Scientists (2024): How the Biological Weapons Convention could verify treaty compliance - Examines the longstanding absence of verification mechanisms in the 1972 BWC and why past efforts to establish them failed
- Council on Strategic Risks (2025): Derailment of the Fifth Working Group of the BWC - Examines the obstruction of a key multilateral effort to strengthen the treaty
Defensive Technologies
- Scientific Reports (2018): Far-UVC light: A new tool to control the spread of airborne-mediated microbial diseases - Welch et al. demonstrate that far-UVC light (207–222 nm) efficiently inactivates airborne viruses and bacteria without harming human skin or eyes
- Scientific Reports (2024): 222 nm far-UVC light markedly reduces infectious airborne virus in an occupied room - Kleiman et al. report a 99.8% reduction of infectious airborne murine norovirus in an occupied room while remaining within regulatory exposure limits
- Lancet Microbe (2025): Inferring the sensitivity of wastewater metagenomic sequencing for virus detection - Nucleic Acid Observatory work on using metagenomic sequencing of environmental samples for pathogen surveillance
- Virology Journal (2025): Revolutionizing immunization: a comprehensive review of mRNA vaccine technology - Peer-reviewed review of mRNA vaccine platforms by Leong, Tham, and Poh
Historical Background
- Wikipedia: Soviet biological weapons program - Documents the covert program spanning the 1920s to at least 1992, operated under the civilian cover organization Biopreparat in violation of the BWC
- Wikipedia: Biopreparat - The covert Soviet agency (1974–1992) that ran the world's largest offensive biological weapons program, employing 30,000–40,000 personnel across ostensibly civilian research institutes
- PMC (2023): The History of Anthrax Weaponization in the Soviet Union - Nikolakakis et al. (Cureus) document the Soviet anthrax weaponization program and its continuing implications for public health and security
- Toby Ord: The Precipice: Existential Risk and the Future of Humanity (2020)
General Context
- 80,000 Hours: Problem profile: Preventing catastrophic pandemics - Argues engineered pandemics are among the world's most pressing problems, complementing AI safety as a top priority
- Bulletin of the Atomic Scientists (2024): Could AI help bioterrorists unleash a new pandemic? - Summarizes empirical research on whether current AI systems provide meaningful uplift to would-be bioterrorists
- Undark (2024): The Long, Contentious Battle to Regulate Gain-of-Function Work - Traces the decades-long regulatory struggle over gain-of-function research
- Science (2025): NIH suspends dozens of pathogen studies over 'gain-of-function' concerns - Reports NIH's suspension of dozens of federally funded pathogen studies in response to an executive order on gain-of-function oversight
Video & Audio
- 80,000 Hours Podcast: Kevin Esvelt on Biosecurity - MIT researcher on biological risks and pandemic preparedness
- Lex Fridman #431: Roman Yampolskiy - Long-form interview on existential risks from advanced AI, including discussion of CBRN risks
- Future of Life Institute: Podcast series - Multiple episodes on biosecurity and existential risk
- RAND: The AI and Biological Weapons Threat - Video briefing on RAND's red-team study of LLM misuse risks in biological weapons development
Footnotes
- Mouton, C., Lucas, C., & Guest, E. (2024). *The Operational Risks of AI in Large-Scale Biological Attacks*. RAND Corporation. Specific figures (12 teams, 80 hours, 7 weeks, 4 scenarios, 15 groups, n=12 completing) are drawn from this study as reported; independent verification of all sub-figures was not possible from sources available. ↩ ↩2 ↩3 ↩4 ↩5 ↩6
- The 75%, 23%, and 72% screening evasion figures are attributed to Microsoft researchers in multiple secondary accounts; however, no primary paper URL was available for direct verification. These figures are reported "according to some sources" and should be treated as approximate pending primary-source confirmation. ↩ ↩2
- Gryphon Scientific evaluation details (150+ hours, 20+ experts, Casagrande quotes) are reported via Semafor coverage and Anthropic disclosures. Direct primary-source documentation was not available in the source cache; figures and quotes are presented as reported. ↩ ↩2 ↩3 ↩4 ↩5
- Anthropic. Responsible Scaling Policy. https://www.anthropic.com/news/anthropics-responsible-scaling-policy (policy document; the specific claim about "at least 10 biorisk evaluations" is reported in secondary accounts and could not be independently verified from the primary document alone). ↩ ↩2
- Anthropic's reported letter to the White House regarding Claude 3.7 Sonnet is cited in multiple news accounts (e.g., Reuters, Politico, 2025); the primary letter text was not publicly available at time of writing. ↩
- OpenAI. Preparedness Framework (Beta). https://openai.com/safety/preparedness (definitions of "high" and "critical" capability thresholds drawn directly from this document). ↩ ↩2 ↩3 ↩4
- UK AI Safety Institute / <EntityLink id="E365" name="us-aisi">US AI Safety Institute</EntityLink>. Joint evaluation of Claude 3.5 Sonnet (2024). Details reported in government press releases and media coverage. ↩ ↩2
- The Biological Weapons Convention was opened for signature in 1972 and entered into force in 1975. As of recent counts, it has approximately 187 states parties. See: United Nations Office for Disarmament Affairs, Biological Weapons Convention (https://www.un.org/disarmament/wmd/bio/). Specific party counts fluctuate; figures here reflect commonly cited totals. ↩ ↩2 ↩3
- U.S. Department of State, *2024 Adherence to and Compliance with Arms Control, Nonproliferation, and Disarmament Agreements and Commitments* (https://www.state.gov/reports/2024-adherence-to-and-compliance-with-arms-control-nonproliferation-and-disarmament-agreements-and-commitments/). Claims about specific named countries reflect the content described in that report; readers should consult the primary document. ↩
- Figures for Biopreparat's founding date, personnel count, and facility count derive primarily from accounts by former program insiders, including Ken Alibek's memoir *Biohazard* (1999), and from academic analyses such as Milton Leitenberg and Raymond A. Zilinskas, *The Soviet Biological Weapons Program: A History* (Harvard University Press, 2012). Exact figures vary across sources and should be treated as estimates. ↩ ↩2 ↩3 ↩4
- Ken Alibek (with Stephen Handelman), *Biohazard: The Chilling True Story of the Largest Covert Biological Weapons Program in the World* (Random House, 1999). Alibek's account is the primary public source for figures including smallpox production capacity; these claims have not been independently verified and should be understood as defector testimony. ↩ ↩2
- The Sverdlovsk anthrax leak is documented in Matthew Meselson et al., "The Sverdlovsk Anthrax Outbreak of 1979," *Science* 266(5188): 1202–1208 (1994) (https://doi.org/10.1126/science.7973702). The figure of at least 68 deaths comes from that study; the authors note that records were incomplete. Yeltsin's 1992 acknowledgment is widely reported in contemporaneous news coverage. ↩ ↩2
- Pasechnik's 1989 defection and its significance are described in multiple accounts, including Tom Mangold and Jeff Goldberg, *Plague Wars* (Macmillan, 1999), and subsequent official British government statements. Specific claims about his briefings to Thatcher and Bush reflect these secondary accounts. ↩
- The 1997 US concern about Iranian and other state recruitment of former Soviet bioweapons scientists is described in contemporaneous reporting, including coverage by The New York Times and government testimony. ↩
- T.J. Torok et al., "A Large Community Outbreak of Salmonellosis Caused by Intentional Contamination of Restaurant Salad Bars," *JAMA* 278(5): 389–395 (1997) (https://doi.org/10.1001/jama.1997.03550050051033). The article documents 751 cases and 45 hospitalizations and confirms deliberate contamination. ↩ ↩2
- Accounts of Aum Shinrikyo's biological program draw on: Richard Danzig et al., *Aum Shinrikyo: Insights into How Terrorists Develop Biological and Chemical Weapons* (Center for a New American Security, 2012) (https://www.cnas.org/publications/reports/aum-shinrikyo-insights-into-how-terrorists-develop-biological-and-chemical-weapons). The $1 billion asset figure is widely cited but difficult to verify independently; it should be treated as an approximation from journalistic and government sources. ↩ ↩2 ↩3
- Citation rc-ae76 ↩
- FBI summary of the Amerithrax investigation: Federal Bureau of Investigation, *Amerithrax Investigative Summary* (2010) (https://www.justice.gov/archive/amerithrax/docs/amx-investigative-summary.pdf). The summary documents 5 deaths and 17 infections and identifies Bruce Ivins as the perpetrator. ↩ ↩2
- The 40–70% pre-2024 screening effectiveness estimate has been cited in policy discussions and academic commentary on biosecurity chokepoints; it is not a single authoritative figure and should be understood as a rough range reflecting expert assessments at the time. See, e.g., discussions in the <EntityLink id="E423" name="johns-hopkins-center-for-health-security">Johns Hopkins Center for Health Security</EntityLink>'s work on nucleic acid synthesis governance. ↩
- The claim that post-patch screening reaches approximately 97% effectiveness derives from reporting on Microsoft's biosecurity research collaboration; readers should consult primary reporting (e.g., coverage in MIT Technology Review and STAT News) and treat this figure as preliminary pending further peer review. ↩
References
1. An overview of the Biological Weapons Convention (BWC), the international treaty prohibiting the development, production, and stockpiling of biological weapons. The resource covers the treaty's history, membership, key provisions, and ongoing challenges in verification and enforcement. It serves as a reference for understanding the international legal framework governing biological weapons.
2. This RAND Corporation research report examines the risk of AI systems providing meaningful uplift to actors seeking to develop biological weapons, focusing on how to assess capability thresholds and decompose the problem for evaluation purposes. It provides a framework for analyzing when AI crosses dangerous capability boundaries in the bioweapons domain and how to structure risk assessments accordingly.
3. *AlphaFold 3 predicts the structure and interactions of all of life's molecules* (Google DeepMind AlphaFold team & Isomorphic Labs, 2024). Introduces AlphaFold 3, an AI model that extends beyond protein structure prediction to model the structure and interactions of DNA, RNA, ligands, and other biological molecules. This represents a significant capability leap with broad implications for drug discovery and biological research; the dual-use nature of such powerful biomolecular modeling raises biosecurity concerns alongside its scientific benefits.
4. The Biden White House Office of Science and Technology Policy (OSTP) released a framework establishing standards for screening nucleic acid synthesis orders to prevent misuse for biological weapons or pandemic-causing agents. The framework aims to create consistent biosecurity screening protocols across the DNA/RNA synthesis industry and represents a key policy intervention at the intersection of biotechnology capabilities and biosecurity governance.
5. This Council on Strategic Risks briefer examines the intersection of cybersecurity and biosecurity, identifying how advances in automation, synthetic biology democratization, and proliferating high-containment facilities create novel threat vectors. It argues that hostile state actors like Russia and North Korea increasingly exploit vulnerabilities in biotech and medical research infrastructure as sub-threshold warfare tools, and offers policy recommendations for improved prevention, detection, and national response.
6. This May 2024 U.S. federal policy establishes a unified oversight framework for dual use research of concern (DURC) and pathogens with enhanced pandemic potential (PEPPs), superseding earlier 2012/2014 DURC policies and the P3CO Framework. It defines two categories of regulated research, assigns responsibilities to principal investigators, institutions, and funding agencies, and creates risk assessment mechanisms for biological research that could threaten public health or national security. The policy took effect May 6, 2025.
7. This analysis examines the BWC working group established in 2022 to strengthen the treaty across seven areas including verification, compliance, and scientific developments. Midway through its four-year mandate, the group faces a fundamental tension between traditional legally binding multilateral models with mandatory inspections and a flexible opt-in approach. The piece provides critical context on why decades of efforts to establish robust BWC compliance mechanisms have failed amid geopolitical tensions.
8. The 80,000 Hours Podcast hosts in-depth interviews with leading researchers and thinkers on AI safety, existential risk, effective altruism, and related high-impact topics. It covers technical AI safety, governance, alignment, superintelligence, AI deception, and emerging risks like AI-nuclear intersections.
9. Biopreparat was a covert Soviet agency (1974–1992) that ran the world's largest offensive biological weapons program, employing 30,000–40,000 personnel across ostensibly civilian research institutes and dual-use production facilities. It pursued genetically engineered pathogens resistant to antibiotics and developed strains with novel pathogenic properties, representing a landmark case study in state-sponsored biological weapons development and dual-use research risks.
10. *Benchtop DNA Synthesis Devices: Capabilities, Biosecurity Implications, and Governance* (Nuclear Threat Initiative). This NTI report examines how emerging benchtop DNA synthesis devices threaten to decentralize DNA production, potentially circumventing existing biosecurity screening protocols maintained by centralized providers. It analyzes device capabilities, biosecurity risks from distributed access, and governance frameworks needed to maintain safety as the technology proliferates into individual laboratories.
11. This CNAS report examines how AI advancements intersect with biosecurity risks, analyzing threats from state actors, nonstate actors, and accidental releases. It assesses whether fears about AI-enabled bioweapons are warranted and provides actionable policy recommendations to mitigate catastrophic biological threats.
12. *The History of Anthrax Weaponization in the Soviet Union* (Nikolakakis et al., 2023, Cureus). This historical paper examines the Soviet Union's anthrax weaponization program and its broader implications for biowarfare research and public health. The authors document how Soviet bioweapon development, particularly through the Biopreparat program, led to technological advances including the first Soviet anthrax vaccine and mass vaccination campaigns for animals and humans, and argue that the program's legacy continues to pose asymmetric threats to contemporary public health and security.
13. OpenAI presents a methodology for evaluating whether LLMs like GPT-4 could meaningfully assist malicious actors in creating biological threats. In a controlled study with 100 participants (50 PhD biology experts, 50 students), they found GPT-4 provides at most mild uplift in biological threat creation accuracy compared to internet-baseline resources. The work is framed as a blueprint for empirical biosecurity evaluation and a potential 'tripwire' for future capability monitoring.
14. The Nuclear Threat Initiative (NTI) biosecurity program addresses biological threats from natural, accidental, and intentional sources, including the intersection of AI and biotechnology. It advances global health security through policy advocacy, governance frameworks, and international coordination to prevent biological catastrophe.
15. *222 nm far-UVC light markedly reduces infectious airborne virus in an occupied room* (Kleiman et al., 2023). Researchers installed four 222-nm light fixtures in a mouse-cage cleaning room and measured the reduction of aerosolized murine norovirus (MNV), a conservative surrogate for influenza and coronavirus. The far-UVC treatment achieved a 99.8% reduction in infectious airborne MNV while remaining within regulatory safety limits, the first direct demonstration of far-UVC efficacy against airborne pathogens in an actual occupied room.
16. Blueprint Biosecurity announces $1M in EXHALE program grants to two research teams evaluating far-UVC light's effectiveness against real human-generated respiratory aerosols containing influenza and SARS-CoV-2. The research aims to build the evidence base needed to deploy far-UVC in schools, hospitals, and public spaces as a pandemic countermeasure, with results expected by mid-2026.
17. Open Philanthropy's 2023 RFI soliciting expert input on far-UVC light technology (200–240 nm) as a promising disinfection approach for reducing airborne pathogen transmission in occupied spaces. The RFI sought insights on safety, efficacy, technological development, environmental impacts, and adoption strategies to inform potential grant-making in biosecurity and pandemic preparedness.
18. Lex Fridman interviews AI safety researcher Roman Yampolskiy about the existential risks of AGI and superintelligent AI, covering topics from AI controllability and deception to self-improving systems and verification challenges. Yampolskiy, author of *AI: Unexplainable, Unpredictable, Uncontrollable*, argues that advanced AI poses fundamental control problems that current approaches cannot solve.
19. Anthropic introduces its Responsible Scaling Policy (RSP), a framework of technical and organizational protocols for managing catastrophic risks as AI systems become more capable. The policy defines AI Safety Levels (ASL-1 through ASL-5+), modeled after biosafety level standards, requiring increasingly strict safety, security, and operational measures tied to a model's potential for catastrophic risk. At the time of the policy's introduction, Claude models were classified ASL-2, with ASL-3 and beyond triggering stricter deployment and security requirements.
20. Documents the Soviet Union's covert and massive biological weapons program spanning from the 1920s to at least 1992, operated under the civilian cover organization Biopreparat in violation of the Biological Weapons Convention. The program developed and stockpiled weaponized pathogens including plague, smallpox, and anthrax for strategic, operational, and anti-agriculture use, representing the largest state bioweapons effort in history.
80,000 Hours' problem profile on catastrophic pandemic prevention, focusing primarily on engineered pandemics as an existential risk. It argues this is one of the world's most pressing problems due to advances in biotechnology that could enable the creation of pathogens far deadlier than natural ones, and outlines career paths and interventions to reduce this risk.
Kilobaser offers a compact, microfluidic chip-based desktop DNA and RNA synthesizer ('Kilobaser one-Xt') that enables on-demand custom oligonucleotide synthesis in under two hours. It is marketed to life science labs as an affordable, independent alternative to commercial oligo ordering services. The product represents a democratization of DNA synthesis technology with potential dual-use biosecurity implications.
This RAND paper surveys 15 cloud laboratory organizations worldwide—remotely operated, automated research facilities representing AI-biotech convergence—and analyzes how these platforms could be exploited by malicious actors to develop or proliferate chemical and biological weapons. The authors document facility details and discuss biosecurity vulnerabilities inherent to the cloud lab model, offering guidance for policymakers and stakeholders.
The SYNTAX System by DNA Script is a benchtop enzymatic DNA synthesis (EDS) platform that enables rapid, on-demand synthesis of custom oligonucleotides without traditional phosphoramidite chemistry. It democratizes access to DNA synthesis by allowing labs to produce custom DNA sequences in hours rather than days. This capability represents a dual-use biosecurity concern as it lowers barriers to synthesizing potentially dangerous genetic sequences.
In response to a Trump executive order on gain-of-function (GOF) research oversight, the National Institutes of Health (NIH) has suspended dozens of federally-funded pathogen studies, with 40 projects immediately suspended and an additional 172 flagged for potential termination. The suspensions affect research on tuberculosis, influenza, COVID-19, and other pathogens conducted primarily at U.S. universities and some NIH in-house laboratories. While the agency is erring on the side of caution regarding potentially dangerous research, many infectious disease scientists have expressed puzzlement and dismay at the selections, particularly the large number of tuberculosis studies affected.
This RAND commentary examines the emerging risks posed by scientific cloud labs—remotely operated, highly automated laboratory facilities—and argues for standardized biosecurity measures including AI-based monitoring and industry consortiums to prevent misuse. The authors highlight how the accessibility and automation of cloud labs create dual-use risks alongside their scientific benefits.
This Bulletin of the Atomic Scientists analysis examines the longstanding absence of verification mechanisms in the 1972 Biological Weapons Convention (BWC), explores why past efforts failed, and considers how advances in AI, genome editing, and biosurveillance technologies could enable new compliance verification approaches following the 2022 Ninth BWC Review Conference's renewed commitment to address these gaps.
The Nucleic Acid Observatory (NAO), a collaboration between SecureBio and MIT's Sculpting Evolution group, develops pathogen-agnostic biosurveillance systems capable of detecting novel pandemic threats before widespread transmission. Their approach combines computational threat detection, large-scale monitoring evaluation (including wastewater surveillance), and real-world pilot deployments to build early warning infrastructure against biological catastrophes, including engineered pathogens.
This RAND Corporation report examines the misuse risks of large language models (LLMs) in biological weapons development through a red-team methodology. Preliminary findings show that while LLMs haven't provided explicit weapon-creation instructions, they do offer guidance useful for planning biological attacks, including agent selection and acquisition strategies. The authors caution that AI's rapid advancement may outpace regulatory oversight, closing historical information gaps that previously hindered bioweapon development.
CEPI (Coalition for Epidemic Preparedness Innovations) announces research into trans-amplifying mRNA (ta-mRNA) vaccines, a next-generation platform that could produce stronger immune responses at lower doses than conventional mRNA vaccines. This technology could enable faster, cheaper, and more scalable vaccine production for pandemic preparedness. The initiative represents a significant step in developing platform technologies for rapid response to emerging biological threats.
This UNICRI report examines the dual-use risks of AI-driven protein-folding prediction tools like AlphaFold, which offer major benefits for medicine but could be weaponized to engineer pathogens or target specific populations. It calls for governance frameworks that balance scientific openness with biosecurity safeguards.
DeepMind outlines the biosecurity measures and risk mitigation strategies implemented for AlphaFold 3, addressing concerns about dual-use potential of a powerful protein structure prediction system. The document explains how DeepMind assessed misuse risks and what safeguards were put in place before releasing the model, serving as a case study in responsible deployment of dual-use AI capabilities.
SecureBio is an organization focused on reducing biological risks, particularly those arising from advances in biotechnology and AI-enabled capabilities. They conduct research and advocacy at the intersection of biosecurity and emerging technologies, including the risks posed by large language models and AI systems that could lower barriers to bioweapon development.
This U.S. State Department annual report assesses global compliance with arms control, nonproliferation, and disarmament agreements and commitments. It evaluates whether nations are adhering to treaties and obligations related to weapons of mass destruction, conventional arms, and related international frameworks. The report serves as an official U.S. government record of arms control compliance findings relevant to international security policy.
The U.S. and UK AI Safety Institutes jointly conducted pre-deployment safety evaluations of Anthropic's upgraded Claude 3.5 Sonnet, testing biological capabilities, cyber capabilities, software/AI development, and safeguard efficacy. The evaluation used question answering, agent tasks, qualitative probing, and red teaming to benchmark the model against prior versions and competitors. This represents one of the first formal government-led pre-deployment AI safety evaluations made public.
This Bulletin of the Atomic Scientists article covers research examining whether current AI systems provide meaningful 'uplift' to would-be bioterrorists seeking to create or deploy pandemic pathogens. The study suggests that as of early 2024, AI does not yet provide substantial additional capability beyond what is already accessible, though the risk trajectory warrants continued monitoring.
This CRS report provides a comprehensive analysis of federal oversight mechanisms for gain-of-function (GOF) research, which enhances pathogen transmissibility or virulence. It reviews existing frameworks like the Federal Select Agent Program and NIH guidelines, then outlines congressional policy options ranging from maintaining the status quo to banning GOF research entirely. The report is directly relevant to biosecurity governance and dual-use research of concern (DURC) policy.
Far-UVC (ultraviolet-C light at 207-222 nm wavelengths) is a disinfection technology that can inactivate pathogens including viruses and bacteria in occupied spaces without the harmful effects of conventional UV-C on human skin and eyes. It represents a potentially powerful tool for reducing airborne and surface transmission of infectious diseases. This Wikipedia article covers its physics, safety profile, efficacy, and current applications.
Revolutionizing immunization: a comprehensive review of mRNA vaccine technology · Springer (peer-reviewed) · Kai Yuan Leong, Seng Kong Tham & Chit Laa Poh · 2025
Far-UVC light: A new tool to control the spread of airborne-mediated microbial diseases · Nature (peer-reviewed) · David Welch et al. · 2017
This study demonstrates that far-UVC light (207-222 nm) can efficiently inactivate airborne viruses and bacteria without harming human skin or eyes, unlike conventional UVC light. The researchers show that a low dose of 2 mJ/cm² of 222-nm light inactivates over 95% of aerosolized H1N1 influenza virus. The key advantage of far-UVC is that its strong absorbance in biological materials prevents it from penetrating human skin and eye tissue, while its wavelength is still short enough to penetrate and inactivate micrometer-sized or smaller pathogens. The authors propose continuous low-dose far-UVC light in indoor public spaces as a safe, inexpensive tool to reduce airborne disease transmission.
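The reported dose-response can be read through the standard first-order UV inactivation model, surviving fraction S = exp(-k·D); the sketch below back-computes the rate constant implied by the paper's "2 mJ/cm² inactivates over 95%" figure. The exponential model and function names are illustrative assumptions, not taken from the paper itself.

```python
import math

def survival_fraction(dose_mj_cm2: float, k: float) -> float:
    """First-order UV inactivation (assumed model): S = exp(-k * D)."""
    return math.exp(-k * dose_mj_cm2)

# Rate constant implied by the reported figure: 2 mJ/cm^2 leaves at most
# 5% of virus infectious, so k >= -ln(0.05) / 2 ~ 1.50 cm^2/mJ.
k_h1n1 = -math.log(0.05) / 2.0

# Extrapolate inactivation at a few doses under this assumed model.
for dose in (1.0, 2.0, 4.0):
    pct = 100.0 * (1.0 - survival_fraction(dose, k_h1n1))
    print(f"{dose:.1f} mJ/cm^2 -> {pct:.1f}% inactivated")
```

Under this model, doubling the dose squares the surviving fraction, which is why modest continuous doses accumulate large reductions over time.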
A Semafor news article reporting on concerns from OpenAI and Anthropic that AI systems could assist malicious actors in developing bioweapons, drawing on findings from Gryphon Scientific's risk assessments. The piece highlights how frontier AI labs are prioritizing biosecurity as a critical safety concern in their red-teaming and deployment policies.
The FBI's official summary of the Amerithrax investigation into the 2001 anthrax letter attacks, concluding that Dr. Bruce E. Ivins, a USAMRIID biodefense researcher, was solely responsible. The case relied on novel microbial forensics, genetic analysis of anthrax spores, and circumstantial behavioral evidence. Ivins died by suicide in 2008 before charges were filed, so the FBI's conclusions were never tested in court.
This blog post from the Nucleic Acid Observatory (NAO) announces a publication in Lancet Microbe presenting their work on using metagenomic sequencing of environmental samples for early detection of biological threats and pandemic pathogens. The NAO framework proposes large-scale, continuous monitoring of nucleic acids in wastewater and other environmental sources to identify novel pathogens before outbreaks become uncontrollable. This represents a biosecurity-focused application of environmental surveillance with implications for both natural pandemic prevention and biodefense.
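A toy version of the "detect before outbreaks become uncontrollable" idea is to flag any sequence whose read counts in wastewater samples grow multiplicatively for several consecutive intervals. The function name, thresholds, and counts below are illustrative assumptions, not the NAO's published detection method.

```python
def flag_exponential_growth(counts, min_fold=2.0, weeks=3):
    """True if read counts grew by >= min_fold per interval
    for `weeks` consecutive intervals (a crude exponential-growth flag)."""
    streak = 0
    for prev, cur in zip(counts, counts[1:]):
        if prev > 0 and cur / prev >= min_fold:
            streak += 1
            if streak >= weeks:
                return True
        else:
            streak = 0
    return False

# Hypothetical weekly read counts for two sequences:
print(flag_exponential_growth([1, 2, 5, 11, 24]))    # sustained doubling -> True
print(flag_exponential_growth([10, 9, 11, 10, 12]))  # background noise -> False
```

Requiring several consecutive growth intervals, rather than a single jump, is a simple way to suppress false alarms from sampling noise in low-abundance sequences.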
Blueprint Biosecurity is an organization focused on biosecurity research and policy, with a particular emphasis on Far-UVC light technology as a scalable intervention to reduce pandemic and biological threat risks. The initiative investigates Far-UVC as a potential tool for continuously disinfecting indoor air, potentially reducing transmission of airborne pathogens including engineered bioweapons.
This paper examines the dual-use risks emerging from AI-powered protein design tools, analyzing how advances in computational biology could be exploited to engineer harmful biological agents. It discusses the biosecurity implications of democratized access to protein engineering capabilities and calls for governance frameworks to mitigate misuse.
SecureDNA is a Swiss nonprofit foundation developing cryptographic screening technology to prevent the synthesis of dangerous pathogens and bioweapons via DNA synthesis providers. It offers an open, privacy-preserving system that allows synthesis companies to check orders against a database of hazardous sequences without revealing the query content. The initiative aims to become a global biosecurity standard for responsible DNA synthesis.
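As a rough illustration of the screening idea, an order can be checked by hashing fixed-length windows of its sequence against a database of hazard digests, so the database never stores plaintext hazard sequences. This is a simplified sketch: SecureDNA's actual protocol uses cryptographic techniques that also hide the query from the screener, and the window size, names, and sequences here are hypothetical.

```python
import hashlib

K = 30  # assumed window size for this sketch

def kmer_digests(seq: str, k: int = K) -> set:
    """SHA-256 digests of every length-k window of a DNA sequence."""
    seq = seq.upper()
    return {hashlib.sha256(seq[i:i + k].encode()).digest()
            for i in range(len(seq) - k + 1)}

def screen_order(order_seq: str, hazard_digests: set) -> bool:
    """True if any window of the order matches a known-hazard digest."""
    return not kmer_digests(order_seq).isdisjoint(hazard_digests)

# Hypothetical hazard entry and two synthesis orders:
hazard = kmer_digests("ACGT" * 10)                      # a 40 bp "hazardous" sequence
print(screen_order("TT" + "ACGT" * 10 + "GG", hazard))  # embeds the hazard -> True
print(screen_order("T" * 60, hazard))                   # benign order -> False
```

Windowed matching catches hazardous sequences embedded inside longer orders, which a whole-sequence comparison would miss.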
OpenAI's Preparedness Framework outlines a structured approach to evaluating and managing catastrophic risks from frontier AI models, including threats related to CBRN weapons, cyberattacks, and loss of human control. It defines risk severity thresholds and ties model deployment decisions to safety evaluations. The framework represents OpenAI's operational policy for responsible frontier model development.
This article from the Council on Strategic Risks examines the failure or obstruction of the Fifth Working Group of the Biological and Toxin Weapons Convention (BTWC), a key multilateral mechanism for strengthening the bioweapons treaty. It analyzes the geopolitical and procedural dynamics that undermined the working group's progress, with implications for global biosecurity governance. The piece highlights the fragility of international cooperation on biological risk reduction.
The Future of Life Institute podcast series features conversations with leading researchers, policymakers, and thinkers on existential risks including AI safety, biosecurity, nuclear threats, and climate change. Episodes explore both technical and governance dimensions of catastrophic risk reduction. It serves as an accessible entry point for understanding the broad landscape of existential risk work.
This article traces the decades-long regulatory struggle over gain-of-function (GOF) research, examining how scientists, policymakers, and biosecurity experts have clashed over defining, overseeing, and limiting experiments that enhance pathogen transmissibility or lethality. It highlights the persistent gaps in federal oversight and the difficulty of establishing enforceable international norms for dual-use biological research.
Evonetix's Evaleo is a silicon chip-based DNA synthesis platform designed for high-fidelity, scalable gene synthesis. The technology aims to improve accuracy and throughput in synthetic biology applications. As a dual-use technology, it has implications for both beneficial biotech research and biosecurity risks.