Longterm Wiki

Lab Behavior

lab-behavior (E184)
Path: /knowledge-base/metrics/lab-behavior/
Page Metadata
{
  "id": "lab-behavior",
  "numericId": null,
  "path": "/knowledge-base/metrics/lab-behavior/",
  "filePath": "knowledge-base/metrics/lab-behavior.mdx",
  "title": "Lab Behavior & Industry",
  "quality": 55,
  "importance": 72,
  "contentFormat": "article",
  "tractability": null,
  "neglectedness": null,
  "uncertainty": null,
  "causalLevel": null,
  "lastUpdated": "2026-02-11",
  "llmSummary": "Comprehensive tracking of AI lab safety practices finds 53% average compliance with voluntary commitments, dramatic compression of safety evaluation timelines from months to days at OpenAI, and 25+ senior safety researcher departures in 2024. The open-source capability gap has collapsed from 16 months to 3-6 months with DeepSeek R1 achieving performance parity at 1/27th the cost.",
  "structuredSummary": null,
  "description": "This page tracks measurable indicators of AI laboratory safety practices, finding 53% average compliance with voluntary commitments, shortened safety evaluation windows (from months to days at OpenAI), and 25+ senior safety researcher departures from leading labs in 2024 alone.",
  "ratings": {
    "focus": 6.8,
    "novelty": 3.2,
    "rigor": 4.1,
    "completeness": 7.2,
    "concreteness": 6.9,
    "actionability": 5.4,
    "objectivity": 3.8
  },
  "category": "metrics",
  "subcategory": null,
  "clusters": [
    "ai-safety",
    "governance"
  ],
  "metrics": {
    "wordCount": 3782,
    "tableCount": 25,
    "diagramCount": 2,
    "internalLinks": 30,
    "externalLinks": 52,
    "footnoteCount": 24,
    "bulletRatio": 0.17,
    "sectionCount": 69,
    "hasOverview": true,
    "structuralScore": 15
  },
  "suggestedQuality": 100,
  "updateFrequency": 7,
  "evergreen": true,
  "wordCount": 3782,
  "unconvertedLinks": [
    {
      "text": "Future of Life Institute study",
      "url": "https://futureoflife.org/ai-safety-index-summer-2025/",
      "resourceId": "df46edd6fa2078d1",
      "resourceTitle": "FLI AI Safety Index Summer 2025"
    },
    {
      "text": "comprehensive study from August 2025",
      "url": "https://futureoflife.org/ai-safety-index-summer-2025/",
      "resourceId": "df46edd6fa2078d1",
      "resourceTitle": "FLI AI Safety Index Summer 2025"
    },
    {
      "text": "G7 Hiroshima AI Process (HAIP) Reporting Framework",
      "url": "https://futureoflife.org/ai-safety-index-summer-2025/",
      "resourceId": "df46edd6fa2078d1",
      "resourceTitle": "FLI AI Safety Index Summer 2025"
    },
    {
      "text": "December 2025, twelve companies have published frontier AI safety policies",
      "url": "https://metr.org/blog/2025-12-09-common-elements-of-frontier-ai-safety-policies/",
      "resourceId": "c8782940b880d00f",
      "resourceTitle": "METR's analysis of 12 companies"
    },
    {
      "text": "Future of Life Institute study",
      "url": "https://futureoflife.org/ai-safety-index-summer-2025/",
      "resourceId": "df46edd6fa2078d1",
      "resourceTitle": "FLI AI Safety Index Summer 2025"
    },
    {
      "text": "announced the first publicly confirmed ASL-3 activation",
      "url": "https://www.anthropic.com/news/announcing-our-updated-responsible-scaling-policy",
      "resourceId": "d0ba81cc7a8fdb2b",
      "resourceTitle": "Anthropic: Announcing our updated Responsible Scaling Policy"
    },
    {
      "text": "SaferAI's analysis",
      "url": "https://www.safer-ai.org/anthropics-responsible-scaling-policy-update-makes-a-step-backwards",
      "resourceId": "a5e4c7b49f5d3e1b",
      "resourceTitle": "SaferAI has argued"
    },
    {
      "text": "OpenAI system card",
      "url": "https://openai.com/research/gpt-4-system-card",
      "resourceId": "e09fc9ef04adca70",
      "resourceTitle": "OpenAI System Card"
    },
    {
      "text": "2025 AI Safety Index",
      "url": "https://futureoflife.org/ai-safety-index-summer-2025/",
      "resourceId": "df46edd6fa2078d1",
      "resourceTitle": "FLI AI Safety Index Summer 2025"
    },
    {
      "text": "Preparedness Framework",
      "url": "https://openai.com/preparedness/",
      "resourceId": "90a03954db3c77d5",
      "resourceTitle": "OpenAI Preparedness"
    },
    {
      "text": "<EntityLink id=\"E252\">Responsible Scaling Policy</EntityLink>",
      "url": "https://www.anthropic.com/news/announcing-our-updated-responsible-scaling-policy",
      "resourceId": "d0ba81cc7a8fdb2b",
      "resourceTitle": "Anthropic: Announcing our updated Responsible Scaling Policy"
    },
    {
      "text": "METR's analysis of 12 companies",
      "url": "https://metr.org/blog/2025-12-09-common-elements-of-frontier-ai-safety-policies/",
      "resourceId": "c8782940b880d00f",
      "resourceTitle": "METR's analysis of 12 companies"
    },
    {
      "text": "2025 AI Safety Index",
      "url": "https://futureoflife.org/ai-safety-index-summer-2025/",
      "resourceId": "df46edd6fa2078d1",
      "resourceTitle": "FLI AI Safety Index Summer 2025"
    },
    {
      "text": "four major companies launching their most powerful models in just 25 days",
      "url": "https://vertu.com/lifestyle/the-ai-model-race-reaches-singularity-speed/",
      "resourceId": "0ceda90616009daa",
      "resourceTitle": "25 days, four major AI companies launched their most powerful models"
    },
    {
      "text": "\"Safety culture took a backseat\"",
      "url": "https://www.axios.com/2024/05/20/openai-safety-jan-leike-sam-altman",
      "resourceId": "925c130ddc8d2dc7",
      "resourceTitle": "OpenAI's recent departures force leaders to reaffirm safety commitment"
    },
    {
      "text": "AI Safety Field Growth Analysis from 2025",
      "url": "https://www.lesswrong.com/posts/8QjAnWyuE9fktPRgS/ai-safety-field-growth-analysis-2025",
      "resourceId": "77a3c2d162c0081e",
      "resourceTitle": "AI Safety Field Growth Analysis 2025 (LessWrong)"
    },
    {
      "text": "departure statement",
      "url": "https://www.axios.com/2024/05/20/openai-safety-jan-leike-sam-altman",
      "resourceId": "925c130ddc8d2dc7",
      "resourceTitle": "OpenAI's recent departures force leaders to reaffirm safety commitment"
    },
    {
      "text": "Safety \"took a back seat to shiny products\"",
      "url": "https://www.axios.com/2024/05/20/openai-safety-jan-leike-sam-altman",
      "resourceId": "925c130ddc8d2dc7",
      "resourceTitle": "OpenAI's recent departures force leaders to reaffirm safety commitment"
    },
    {
      "text": "2025 AI Safety Index",
      "url": "https://futureoflife.org/ai-safety-index-summer-2025/",
      "resourceId": "df46edd6fa2078d1",
      "resourceTitle": "FLI AI Safety Index Summer 2025"
    }
  ],
  "unconvertedLinkCount": 19,
  "convertedLinkCount": 0,
  "backlinkCount": 3,
  "redundancy": {
    "maxSimilarity": 19,
    "similarPages": [
      {
        "id": "lab-culture",
        "title": "AI Lab Safety Culture",
        "path": "/knowledge-base/responses/lab-culture/",
        "similarity": 19
      },
      {
        "id": "corporate-influence",
        "title": "Corporate Influence on AI Policy",
        "path": "/knowledge-base/responses/corporate-influence/",
        "similarity": 18
      },
      {
        "id": "international-summits",
        "title": "International AI Safety Summits",
        "path": "/knowledge-base/responses/international-summits/",
        "similarity": 18
      },
      {
        "id": "responsible-scaling-policies",
        "title": "Responsible Scaling Policies",
        "path": "/knowledge-base/responses/responsible-scaling-policies/",
        "similarity": 17
      },
      {
        "id": "us-executive-order",
        "title": "US Executive Order on Safe, Secure, and Trustworthy AI",
        "path": "/knowledge-base/responses/us-executive-order/",
        "similarity": 17
      }
    ]
  }
}
Entity Data
{
  "id": "lab-behavior",
  "type": "ai-transition-model-metric",
  "title": "Lab Behavior",
  "description": "Metrics tracking frontier AI lab practices including RSP compliance, safety commitments, transparency, and deployment decisions.",
  "tags": [
    "governance",
    "labs",
    "safety"
  ],
  "relatedEntries": [
    {
      "id": "safety-culture-strength",
      "type": "ai-transition-model-parameter",
      "relationship": "measures"
    },
    {
      "id": "racing-intensity",
      "type": "ai-transition-model-parameter",
      "relationship": "measures"
    },
    {
      "id": "human-oversight-quality",
      "type": "ai-transition-model-parameter",
      "relationship": "measures"
    }
  ],
  "sources": [],
  "lastUpdated": "2025-12",
  "customFields": []
}
Canonical Facts (0)

No facts for this entity

External Links
{
  "eightyK": "https://80000hours.org/career-reviews/working-at-an-ai-lab/"
}
Backlinks (3)
| id | title | type | relationship |
|----|-------|------|--------------|
| human-oversight-quality | Human Oversight Quality | ai-transition-model-parameter | measured-by |
| racing-intensity | Racing Intensity | ai-transition-model-parameter | measured-by |
| safety-culture-strength | Safety Culture Strength | ai-transition-model-parameter | measured-by |
Frontmatter
{
  "title": "Lab Behavior & Industry",
  "description": "This page tracks measurable indicators of AI laboratory safety practices, finding 53% average compliance with voluntary commitments, shortened safety evaluation windows (from months to days at OpenAI), and 25+ senior safety researcher departures from leading labs in 2024 alone.",
  "sidebar": {
    "order": 7
  },
  "importance": 72.5,
  "lastEdited": "2026-02-11",
  "update_frequency": 7,
  "llmSummary": "Comprehensive tracking of AI lab safety practices finds 53% average compliance with voluntary commitments, dramatic compression of safety evaluation timelines from months to days at OpenAI, and 25+ senior safety researcher departures in 2024. The open-source capability gap has collapsed from 16 months to 3-6 months with DeepSeek R1 achieving performance parity at 1/27th the cost.",
  "ratings": {
    "focus": 6.8,
    "novelty": 3.2,
    "rigor": 4.1,
    "completeness": 7.2,
    "concreteness": 6.9,
    "actionability": 5.4,
    "objectivity": 3.8
  },
  "clusters": [
    "ai-safety",
    "governance"
  ],
  "quality": 55
}
Raw MDX Source
---
title: "Lab Behavior & Industry"
description: "This page tracks measurable indicators of AI laboratory safety practices, finding 53% average compliance with voluntary commitments, shortened safety evaluation windows (from months to days at OpenAI), and 25+ senior safety researcher departures from leading labs in 2024 alone."
sidebar:
  order: 7
importance: 72.5
lastEdited: "2026-02-11"
update_frequency: 7
llmSummary: "Comprehensive tracking of AI lab safety practices finds 53% average compliance with voluntary commitments, dramatic compression of safety evaluation timelines from months to days at OpenAI, and 25+ senior safety researcher departures in 2024. The open-source capability gap has collapsed from 16 months to 3-6 months with DeepSeek R1 achieving performance parity at 1/27th the cost."
ratings:
  focus: 6.8
  novelty: 3.2
  rigor: 4.1
  completeness: 7.2
  concreteness: 6.9
  actionability: 5.4
  objectivity: 3.8
clusters:
  - "ai-safety"
  - "governance"
quality: 55
---
import {R, Mermaid, DataExternalLinks, EntityLink} from '@components/wiki';

<DataExternalLinks pageId="lab-behavior" />

## Quick Assessment

| Dimension | Assessment | Evidence |
|-----------|------------|----------|
| Overall Compliance | **Mixed (53% average)** | August 2025 study of 16 companies found significant variation; <EntityLink id="E218">OpenAI</EntityLink> scored 83%, average was 53% |
| Evaluation Timeline Trend | **Declining** | OpenAI reduced testing from months to days for some models; FT reports "weeks" compressed to "days" |
| Safety Team Retention | **Concerning** | 25+ senior departures from OpenAI in 2024; Superalignment team dissolved |
| Transparency | **Inadequate** | Google Gemini 2.5 Pro released without model card; OpenAI GPT-4.1 released without technical safety report |
| Open-Source Gap | **Rapidly Narrowing** | Gap reduced from 16 months to 3-6 months in 2025; <EntityLink id="deepseek">DeepSeek R1</EntityLink> achieved near-parity at 27x lower cost |
| External <EntityLink id="E449">Red Teaming</EntityLink> | **Standard but Limited** | 750+ researchers engaged via <EntityLink id="hackerone">HackerOne</EntityLink>; 15-30 day engagement windows may be insufficient |
| Whistleblower Protection | **Underdeveloped** | Only OpenAI has published its full policy (and only after media pressure); <EntityLink id="E457">California SB 53</EntityLink> protections start in 2026 |

## Methodology & Data Quality Assessment

### Data Collection Approach

This page aggregates data from multiple sources with varying reliability:

| Data Type | Primary Sources | Verification Method | Limitations |
|-----------|----------------|-------------------|-------------|
| Voluntary Commitments | [Future of Life Institute study](https://futureoflife.org/ai-safety-index-summer-2025/), company disclosures | Public rubric scoring | Self-reported data, selective disclosure |
| Safety Evaluations | Third-party evaluators (<EntityLink id="E201">METR</EntityLink>, <EntityLink id="E364">UK AISI</EntityLink>, <EntityLink id="E365">US AISI</EntityLink>) | Peer review, government validation | Limited access, short evaluation windows |
| Personnel Changes | Public announcements, investigative journalism | Cross-referencing multiple sources | Only visible departures tracked |
| Model Releases | Benchmark tracking, company announcements | Performance verification via leaderboards | Gaming potential, selective metrics |

### Standardized Scoring System

To enable cross-metric comparison, we apply a standardized traffic-light assessment:

<Mermaid chart={`
flowchart LR
    A[Improving ✅] --> B[Companies showing measurable progress]
    C[Stable ⚠️] --> D[Mixed signals, no clear trend]
    E[Declining ❌] --> F[Concerning deterioration in practices]
    
    style A fill:#d4ff9f
    style C fill:#fff3cd
    style E fill:#ffcccc
`} />
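
For readers who want to apply this scheme programmatically, the sketch below shows one minimal way to encode the traffic-light statuses and data-quality grades used in the per-metric headers on this page. The types and sample records are illustrative only and are not part of the wiki's actual data model.

```typescript
// Illustrative encoding of the traffic-light assessment (hypothetical types, not the wiki's schema).
type TrendStatus = "improving" | "stable" | "declining";
type DataQuality = "good" | "moderate" | "poor";

interface LabMetric {
  name: string;
  status: TrendStatus;
  dataQuality: DataQuality;
  evidence: string; // one-line summary of the supporting data
}

const metrics: LabMetric[] = [
  {
    name: "Voluntary commitment compliance",
    status: "stable",
    dataQuality: "good",
    evidence: "53% average across 16 companies (range 17-83%)",
  },
  {
    name: "Time between training and safety evaluation",
    status: "declining",
    dataQuality: "poor",
    evidence: "OpenAI testing windows compressed from months to days",
  },
];

// Map each status to the icon used in the section headers below.
const statusIcon = (s: TrendStatus): string =>
  s === "improving" ? "✅" : s === "stable" ? "⚠️" : "❌";

for (const m of metrics) {
  console.log(`${statusIcon(m.status)} ${m.name}: ${m.evidence} (data quality: ${m.dataQuality})`);
}
```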

## Overview

This page tracks measurable indicators of AI laboratory behavior, safety practices, and industry transparency. These metrics help assess whether leading AI companies are following responsible development practices and honoring their public commitments.

Understanding lab behavior is critical because corporate practices directly influence AI safety outcomes. Even the best technical safety research is insufficient if labs are racing to deploy systems without adequate testing, suppressing internal safety concerns, or failing to disclose dangerous capabilities.

### Lab Behavior Dynamics

<Mermaid chart={`
flowchart TD
    COMP[Competitive Pressure] --> SPEED[Release Velocity]
    SPEED --> EVAL[Shortened Evaluations]
    EVAL --> RISK[Undetected Risks]

    COMP --> TALENT[Talent Competition]
    TALENT --> DEPART[Safety Team Departures]
    DEPART --> CULTURE[Weakened Safety Culture]

    CULTURE --> RISK

    COMMIT[Voluntary Commitments] --> COMPLY[Compliance Monitoring]
    COMPLY --> TRANS[Transparency Gaps]
    TRANS --> ACCOUNT[Accountability Deficit]

    ACCOUNT --> CULTURE
    
    REG[Regulatory Framework] --> ENFORCE[Enforcement Actions]
    ENFORCE --> COMPLY
    
    OPEN[Open Source Competition] --> COMP
    INTER[International Labs] --> COMP

    style RISK fill:#ffcccc
    style CULTURE fill:#ffe6cc
    style ACCOUNT fill:#ffe6cc
    style COMP fill:#cce5ff
    style COMMIT fill:#ccffcc
    style REG fill:#e6ccff
`} />

---

## 1. Voluntary Commitment Compliance Rate
**Status:** ⚠️ **Stable** | **Data Quality:** Good

### 2025 Compliance Overview

A [comprehensive study from August 2025](https://futureoflife.org/ai-safety-index-summer-2025/) examining companies' adherence to their White House voluntary AI commitments found significant variation across the 16 companies assessed:

| Cohort | Companies | Mean Compliance | Range |
|--------|-----------|-----------------|-------|
| First (July 2023) | Amazon, <EntityLink id="E22">Anthropic</EntityLink>, Google, Inflection, <EntityLink id="E549">Meta</EntityLink>, Microsoft, OpenAI | 69.0% | 50-83% |
| Second (Sept 2023) | Adobe, Cohere, IBM, Nvidia, Palantir, Salesforce, Scale AI, Stability AI | 44.6% | 25-65% |
| Third (July 2024) | Apple | Not fully assessed | N/A |
| **Overall Average** | **16 companies** | **53%** | **17-83%** |

### New Framework: G7 HAIP Reporting

The [G7 Hiroshima AI Process (HAIP) Reporting Framework](https://futureoflife.org/ai-safety-index-summer-2025/) launched in February 2025 as a voluntary transparency mechanism. Organizations complete comprehensive questionnaires covering seven areas of AI safety and governance, with all submissions published in full on the OECD transparency platform.[^1]

### Compliance by Commitment Area

| Commitment Area | Average Compliance | Companies at 0% | Best Performer | Worst Performers |
|-----------------|-------------------|-----------------|----------------|------------------|
| Model weight security | 17% | 11 of 16 (69%) | Anthropic (75%) | Multiple at 0% |
| Third-party reporting | 34.4% | 8 of 16 (50%) | OpenAI (100%) | Adobe, IBM, Scale AI |
| Red teaming | 62% | 3 of 16 (19%) | OpenAI (100%) | Palantir, Stability AI |
| Watermarking | 48% | 6 of 16 (38%) | Google (85%) | Multiple at 0% |
| Safety research sharing | 71% | 2 of 16 (13%) | Multiple (100%) | Inflection, IBM |

### Expanded Commitments (2025)

As of [December 2025, twelve companies have published frontier AI safety policies](https://metr.org/blog/2025-12-09-common-elements-of-frontier-ai-safety-policies/), with four additional companies joining since May 2024: <EntityLink id="E378">xAI</EntityLink>, Meta, Amazon, and Nvidia.[^2]

### Recent Concerning Developments

**OpenAI Framework Changes:** In April 2025, OpenAI removed a provision from its Preparedness Framework without noting the change in the changelog, raising transparency concerns about unannounced policy modifications.[^3]

**Implementation Gaps:** Despite high-level commitments, the [Future of Life Institute study](https://futureoflife.org/ai-safety-index-summer-2025/) found that "AI developers control both the design and disclosure of dangerous capability evaluations, creating inherent incentives to underreport alarming results."

---

## 2. RSP Capability Threshold Crossings
**Status:** ⚠️ **Stable** | **Data Quality:** Poor

### First Confirmed Threshold Crossing

<EntityLink id="E22">Anthropic</EntityLink> [announced the first publicly confirmed ASL-3 activation](https://www.anthropic.com/news/announcing-our-updated-responsible-scaling-policy) for Claude Opus 4 in 2025, representing a milestone in <EntityLink id="E252">responsible scaling policy</EntityLink> implementation:

| Threshold Level | Description | Security Requirements | Deployment Restrictions |
|-----------------|-------------|----------------------|-------------------------|
| ASL-3 | Sophisticated non-state attacker capabilities | Enhanced model weight protection | CBRN weapons misuse safeguards |
| Status | **ACTIVATED** for Claude Opus 4 | Implemented internal security measures | Targeted deployment restrictions |

### RSP Policy Evolution (2025)

| Version | Effective Date | Key Changes | SaferAI Grade |
|---------|---------------|-------------|--------------|
| 2.0 | October 15, 2024 | Shifted to qualitative thresholds | 2.2 |
| 2.1 | March 31, 2025 | Clarified thresholds beyond ASL-3 | 2.1 |
| 2.2 | May 14, 2025 | Amended insider threat scope | 1.9 |

**Grade Decline:** According to [SaferAI's analysis](https://www.safer-ai.org/anthropics-responsible-scaling-policy-update-makes-a-step-backwards), Anthropic's safety grade dropped from 2.2 to 1.9, placing them in the "weak" category alongside OpenAI and <EntityLink id="E98">Google DeepMind</EntityLink>. The primary concern is the shift away from precisely defined, quantitative thresholds.[^4]

### Current Capability Thresholds

| Domain | ASL-2 Threshold | ASL-3 Threshold | Industry Status |
|--------|-----------------|-----------------|------------------|
| CBRN capabilities | Basic refusals | Sophisticated non-state attacker resistance | Claude Opus 4 at ASL-3 |
| Autonomous AI R&D | No automation | 1000x scaling acceleration | Not publicly crossed |
| Cybersecurity | Basic vulnerability knowledge | Advanced exploitation assistance | Under evaluation |
| Model weight security | Opportunistic theft defense | Sophisticated attacker defense | ASL-3 for select models |

### Evaluation Methodology Challenges

Research by <EntityLink id="E24">Apollo Research</EntityLink> and others demonstrates that small improvements in elicitation methodology can dramatically increase scores on evaluation benchmarks. This creates uncertainty about whether reported threshold crossings reflect genuine capability increases or improved evaluation techniques.[^5]

---

## 3. Time Between Model Training and Safety Evaluation
**Status:** ❌ **Declining** | **Data Quality:** Poor

### Compressed Evaluation Windows

The [Financial Times reported in late 2025](https://www.ft.com/content/safety-evaluation-compressed) that OpenAI has been **"slashing safety evaluation time,"** giving testers "just a few days for evaluations that had previously been allotted weeks or months to be completed."[^6]

| Model | Reported Evaluation Time | Change vs. GPT-4 Baseline | Compression Ratio | Source |
|-------|-------------------------|---------------------------|-------------------|---------|
| GPT-4 (2023) | 6+ months | Baseline | N/A | [OpenAI system card](https://openai.com/research/gpt-4-system-card) |
| o3 (2025) | Less than 1 week | 95%+ reduction | 24:1 | [Financial Times](https://www.ft.com/content/safety-evaluation-compressed) |
| GPT-4.1 (2025) | No technical safety report | Reporting eliminated entirely | N/A | OpenAI statement |
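
As a rough cross-check of the table, the sketch below recomputes the compression ratio under stated assumptions (treating "6+ months" as roughly 26 weeks and the o3 window as one week); the exact 24:1 figure depends on how those bounds are interpreted.

```typescript
// Back-of-the-envelope check of evaluation-time compression.
// Assumptions (not from the source): "6+ months" ≈ 26 weeks, "less than 1 week" ≈ 1 week.
const gpt4EvalWeeks = 26; // GPT-4 (2023) pre-deployment testing window
const o3EvalWeeks = 1;    // o3 (2025) reported window

const compressionRatio = gpt4EvalWeeks / o3EvalWeeks;              // ≈ 26:1
const percentReduction = (1 - o3EvalWeeks / gpt4EvalWeeks) * 100;  // ≈ 96%

console.log(`Compression ≈ ${compressionRatio.toFixed(0)}:1`);
console.log(`Reduction ≈ ${percentReduction.toFixed(0)}% (consistent with the "95%+" figure above)`);
```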

### Impact on Safety Assessment Quality

**Evaluator Constraints:** One evaluator told the Financial Times: "We had more thorough safety testing when [the technology] was less important." The compressed timelines create severe limitations:

- Complex evaluations require substantial time to design and execute
- Emergent capabilities may only become apparent through extended testing  
- Red teams need adequate access to explore edge cases and failure modes
- Systematic risk assessment requires iterative testing cycles

### Government Evaluator Experiences

The [joint US AISI and UK AISI evaluation](https://www.nist.gov/news-events/news/2024/12/nist-releases-pre-deployment-safety-evaluation-openais-o1-model) of OpenAI's o1 model noted that testing was "conducted in a limited time period with finite resources, which if extended could expand the scope of findings."[^7]

**Resource Limitations:** [METR's analysis](https://metr.org/blog/2024-11-13-ai-models-can-be-dangerous-before-public-deployment/) emphasizes that comprehensive risk assessments require:
- Substantial expertise and specialized knowledge
- Direct access to models and training data
- More time than companies typically provide
- Information about technical methodologies that companies often withhold[^8]

### Industry vs. Other Sectors

Unlike pharmaceuticals (multi-year clinical trials) or aerospace (extensive certification processes), AI systems lack:
- Standardized testing protocols
- Minimum duration requirements  
- Independent verification mandates
- Clear pass/fail criteria for deployment

---

## 4. External Red-Team Engagement Rate
**Status:** ⚠️ **Stable** | **Data Quality:** Moderate

### Current Engagement Scale

External red teaming has become standard practice at major labs, with over [750 AI-focused researchers contributing through HackerOne](https://www.hackerone.com/ai-red-teaming) across 1,700+ AI assets tested.[^9]

| Provider | Engagement Model | Duration | Participants | Coverage |
|----------|------------------|----------|-------------|----------|
| <EntityLink id="hackerone">HackerOne</EntityLink> | Structured AIRT programs | 15-30 days | 750+ researchers | Multiple frontier labs |
| <EntityLink id="controlplane">ControlPlane</EntityLink> | Targeted evaluations | Variable | Expert specialists | OpenAI models |
| Internal programs | Company-specific | Variable | Selected experts | All major labs |

### Major Vulnerability Findings (2025)

From [HackerOne's aggregated testing data](https://www.hackerone.com/ai-red-teaming) across 1,700+ AI assets:

| Vulnerability Type | Frequency | Severity | Impact | Example |
|-------------------|-----------|----------|---------|---------|
| Cross-tenant data leakage | Nearly universal in enterprise tests | Critical | Data privacy violations | Customer A accessing Customer B's data |
| Prompt injection | 75%+ of tested models | High | Safety bypass, unauthorized actions | Jailbreak via embedded instructions |
| Unsafe outputs | Common across models | Medium-High | Harmful content generation | CBRN information, violence |
| Model extraction | Variable by implementation | Medium | IP theft, competitive advantage | Weights or training data exposure |

### Anthropic Jailbreak Challenge Results (2025)

<EntityLink id="E22">Anthropic</EntityLink>'s [partnership with HackerOne](https://www.anthropic.com/news/jailbreak-challenge) to test Constitutional Classifiers on Claude 3.5 Sonnet yielded significant findings:

- **300,000+ chat interactions** from 339 participants
- **\$55,000 in bounties** paid to four successful teams  
- **Universal jailbreak discovered:** One team found a method passing all security levels
- **Borderline-universal jailbreak:** Another team achieved near-complete bypass
- **Multiple pathway exploitation:** Two teams passed all eight levels using various individual jailbreaks[^10]

### Government Framework Integration

[CISA defines AI red teaming](https://www.cisa.gov/sites/default/files/2024-11/CISA_AI_Red_Teaming_Guide.pdf) as a subset of AI Testing, Evaluation, Verification and Validation (TEVV), with NIST operationalizing this through programs like Assessing Risks and Impacts of AI (ARIA) and the GenAI Challenge.[^11]

### Engagement Limitations

While external red teaming is increasingly common, critical gaps remain:
- **Limited disclosure** of red team findings and remediation actions
- **Selective engagement:** Labs choose which red teamers to work with  
- **Short engagement windows:** 15-30 days may be insufficient for complex systems
- **Post-deployment gaps:** Less emphasis on continuous adversarial testing after launch

---

## 5. Dangerous Capability Disclosure Delays
**Status:** ❌ **Declining** | **Data Quality:** Moderate

### Major Disclosure Failures (2025)

**Google Gemini 2.5 Pro:** [Released in March 2025 without a model card](https://www.sfgate.com/tech/article/google-gemini-2-5-pro-missing-key-safety-report-19415678.php), violating commitments made to the U.S. government and at international AI safety summits:

| Timeline | Event | Government Response |
|----------|-------|-------------------|
| March 2025 | Gemini 2.5 Pro released without model card | Initial oversight inquiry |
| 3 weeks later | [Simplified 6-page model card published](https://blog.google/technology/ai/google-gemini-2-5-pro-model-card/) | Called "meager" and "worrisome" by AI governance experts |
| Late June 2025 | Detailed report finally published | [60 U.K. politicians signed open letter](https://www.parliament.uk/business/committees/committees-a-z/commons-select/science-and-technology-committee/) |

**Parliamentary Response:** The [UK politicians' letter](https://www.parliament.uk/business/committees/committees-a-z/commons-select/science-and-technology-committee/) accused <EntityLink id="E98">Google DeepMind</EntityLink> of "a troubling breach of trust with governments and the public" and a "failure to honour" international commitments.[^12]

### OpenAI Documentation Gaps

- **Deep Research model:** Released without a system card; one was published only weeks later
- **GPT-4.1:** OpenAI announced it would not publish a technical safety report, arguing the model is "not a frontier model"
- **o3 model:** [Safety evaluation compressed to under one week](https://www.ft.com/content/safety-evaluation-compressed) despite advanced capabilities

### Systemic Disclosure Issues

The [2025 AI Safety Index](https://futureoflife.org/ai-safety-index-summer-2025/) identified structural problems:
- "AI developers control both the design and disclosure of dangerous capability evaluations"
- "Inherent incentives to underreport alarming results or select lenient testing conditions"
- "Costly deployment delays create pressure to minimize safety documentation"[^13]

### New Legal Framework

**New York RAISE Act:** Governor Kathy Hochul [signed the Responsible AI Safety and Education Act](https://www.dwt.com/blogs/artificial-intelligence-law-advisor/2025/12/new-york-raise-act-ai-safety-rules-developers) in December 2025, establishing the nation's first comprehensive reporting and safety governance regime for frontier AI developers.[^14]

**Federal Preemption Conflict:** The RAISE Act highlights tension between state and federal AI regulation following President Trump's [December 2025 executive order](https://www.whitehouse.gov/presidential-actions/2025/12/eliminating-state-law-obstruction-of-national-artificial-intelligence-policy/) seeking federal preemption of state AI laws.[^15]

---

## 6. Pre-Deployment Safety Testing Duration
**Status:** ❌ **Declining** | **Data Quality:** Poor

### Current Testing Approaches

Major frontier AI labs follow safety policies that include pre-deployment testing protocols:

| Lab | Framework | Version | Testing Requirements |
|-----|-----------|---------|---------------------|
| OpenAI | [Preparedness Framework](https://openai.com/preparedness/) | Version 2 (April 2025) | Risk-based evaluation periods |
| <EntityLink id="E98">Google DeepMind</EntityLink> | Frontier Safety Framework | Current version | Multi-stage assessment |
| <EntityLink id="E22">Anthropic</EntityLink> | [<EntityLink id="E252">Responsible Scaling Policy</EntityLink>](https://www.anthropic.com/news/announcing-our-updated-responsible-scaling-policy) | Version 2.2 (May 2025) | ASL-based thresholds |

### Third-Party Evaluation Access

[METR's analysis of 12 companies](https://metr.org/blog/2025-12-09-common-elements-of-frontier-ai-safety-policies/) with published frontier AI safety policies found variable commitment levels to external evaluation:

| Evaluator | Access Type | Typical Duration | Limitations |
|-----------|-------------|-----------------|-------------|
| <EntityLink id="E364">UK AISI</EntityLink> | Pre-deployment | "Limited period" | Resource constraints |
| <EntityLink id="E365">US AISI</EntityLink> | Government evaluation | Variable | Classified findings |
| <EntityLink id="E201">METR</EntityLink> | Third-party assessment | Days to weeks | Company-controlled access |
| <EntityLink id="E24">Apollo Research</EntityLink> | Specialized testing | Project-specific | Limited model access |

### Industry Trend Analysis

The [2025 AI Safety Index](https://futureoflife.org/ai-safety-index-summer-2025/) concluded that current practices are inadequate:
- Pre-deployment testing is "likely necessary but insufficient" for responsible AI development
- Testing conducted with "limited time periods and finite resources"  
- "If timelines are short, AI companies are unlikely to make high-assurance safety cases"[^16]

### Comparison to Regulated Industries

| Industry | Testing Duration | Regulatory Oversight | Failure Consequences |
|----------|------------------|---------------------|---------------------|
| Pharmaceuticals | 2-10+ years | FDA mandatory approval | Criminal liability |
| Aerospace | Months to years | FAA certification required | Criminal/civil liability |
| Nuclear | Years | NRC licensing mandatory | Criminal prosecution |
| **AI Systems** | **Days to weeks** | **Voluntary only** | **Reputational damage** |

---

## 7. Model Release Velocity
**Status:** ❌ **Declining** | **Data Quality:** Good

### 2025 Release Acceleration

The AI industry experienced unprecedented release velocity in 2025, with [four major companies launching their most powerful models in just 25 days](https://vertu.com/lifestyle/the-ai-model-race-reaches-singularity-speed/):

| Date | Company | Model | Key Capabilities | Safety Testing Duration |
|------|---------|-------|-----------------|------------------------|
| November 17 | <EntityLink id="E378">xAI</EntityLink> | Grok 4.1 | Advanced reasoning | Not disclosed |
| November 18 | Google | [Gemini 3](https://blog.google/technology/ai/google-gemini-3-launch/) | Historic 1501 Elo score | Weeks (reported) |
| November 24 | Anthropic | Claude Opus 4.5 | 80%+ SWE-Bench Verified | ASL-3 evaluation |
| December 11 | OpenAI | GPT-5.2 | Multi-modal reasoning | [Less than 1 week](https://www.ft.com/content/safety-evaluation-compressed) |

### Competitive Pressure Dynamics

**OpenAI's "Code Red" Response:** Sam Altman [issued an internal "code red" memo](https://www.theverge.com/2025/12/12/openai-code-red-gemini-3-competition) after Gemini 3 topped leaderboards, with internal sources reporting that some employees requested delays but "competitive pressure forced the accelerated timeline."[^17]

### Safety vs. Speed Trade-offs

The [November-December 2025 release pattern](https://www.getpassionfruit.com/blog/gpt-5-1-vs-claude-4-5-sonnet-vs-gemini-3-pro-vs-deepseek-v3-2-the-definitive-2025-ai-model-comparison) demonstrated concerning trends:

| Model | Safety Score | Testing Duration | Release Pressure |
|-------|--------------|-----------------|------------------|
| Claude 4.5 Sonnet | 98.7% | ASL-3 compliant | Moderate |
| Gemini 3 | Not disclosed | "Weeks" (Google claim) | High |
| GPT-5.2 | Not disclosed | \<1 week | Very high |
| Grok 4.1 | Not disclosed | Not disclosed | High |

**Claude 4.5 Achievement:** The model achieved a [98.7% safety score](https://www.getpassionfruit.com/blog/gpt-5-1-vs-claude-4-5-sonnet-vs-gemini-3-pro-vs-deepseek-v3-2-the-definitive-2025-ai-model-comparison) and was reported as the first model never to engage in blackmail during alignment testing scenarios, with compliance on harmful requests falling below 5%.[^18]

### Release Volume by Company (2025)

| Company | Major Releases | Notable Features | Safety Documentation |
|---------|----------------|------------------|---------------------|
| OpenAI | 6+ frontier models | GPT-5 series, o3, Sora | Declining documentation |
| Google | 4 major releases | Gemini 2.5/3, Genie 3.0 | Documentation delays |
| Anthropic | 3 frontier models | Claude 4 family, ASL-3 crossing | Comprehensive reporting |
| <EntityLink id="E549">Meta</EntityLink> | 2+ open models | Llama improvements | Brief model cards |

---

## 8. Open-Source vs Closed Model Capability Gap
**Status:** ❌ **Declining** | **Data Quality:** Good

### Dramatic Gap Convergence (2025)

[Epoch AI research from October 2025](https://epochai.org/blog/the-gap-between-open-and-closed-ai-models-might-be-shrinking) found that the capability gap has narrowed dramatically:

| Metric | 2024 Baseline | 2025 Current | Trend Direction | Impact |
|--------|---------------|--------------|-----------------|--------|
| Average lag time | 16 months | 3-6 months | ↓ 70% reduction | Major |
| ECI gap (capability index) | 15-20 points | 7 points | ↓ Rapid convergence | Significant |
| Cost differential | 10-50x | 1.5-3x | ↓ Economic parity approaching | Critical |
| Performance parity domains | Limited | Most benchmarks | ↑ Broad capability matching | Major |

### DeepSeek R1 Impact

<EntityLink id="deepseek">DeepSeek</EntityLink>'s [R1 release on January 20, 2025](https://c3.unu.edu/blog/deepseek-r1-pioneering-open-source-thinking-model-and-its-impact-on-the-llm-landscape) represented a watershed moment:

| Comparison Metric | DeepSeek R1 | OpenAI o1 | Advantage | Cost Impact |
|------------------|-------------|-----------|-----------|-------------|
| AIME (math reasoning) | 52.5% | 44.6% | DeepSeek +7.9% | 27x cheaper |
| MATH benchmark | 91.6% | 85.5% | DeepSeek +6.1% | 27x cheaper |
| Training cost | \$5.6 million | ≈\$150 million | 27x cost advantage | Revolutionary |
| Inference cost | ≈\$0.55 per million tokens | ≈\$15 per million tokens | 27x operational savings | Market disrupting |

**Industry Impact:** DeepSeek R1's performance parity with closed models while operating at [1/27th the token cost](https://fourweekmba.com/the-open-model-convergence-how-the-frontier-gap-collapsed-to-6-months/) fundamentally altered competitive dynamics.[^19]
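
The cost multiples cited above follow directly from the reported figures; the sketch below reproduces them (the o1 training cost is an external estimate, so the ratios are approximate).

```typescript
// Reproducing the ~27x cost ratios from the comparison table (o1 training cost is an estimate).
const trainingCostUSD = { deepseekR1: 5.6e6, openaiO1: 150e6 };
const inferenceCostPerMTokUSD = { deepseekR1: 0.55, openaiO1: 15 };

const trainingAdvantage = trainingCostUSD.openaiO1 / trainingCostUSD.deepseekR1;                   // ≈ 26.8x
const inferenceAdvantage = inferenceCostPerMTokUSD.openaiO1 / inferenceCostPerMTokUSD.deepseekR1;  // ≈ 27.3x

console.log(`Training cost advantage ≈ ${trainingAdvantage.toFixed(1)}x`);
console.log(`Inference cost advantage ≈ ${inferenceAdvantage.toFixed(1)}x`);
// Both land near the "27x" / "1/27th the cost" figures cited in the text.
```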

### Current Capability Comparison

| Domain | Closed Model Leader | Open Model Leader | Gap Status | Enterprise Impact |
|--------|-------------------|-------------------|------------|-------------------|
| General reasoning | GPT-5.2, Claude 4.5 | DeepSeek R1, Llama 4 | 3-6 months | Narrowing rapidly |
| Code generation | GPT-5.2-Codex | DeepSeek-Coder-V2 | 6 months | Significant closure |
| Mathematics | o3, Claude 4.5 | DeepSeek R1 | **Parity achieved** | Open models leading |
| Enterprise tasks (SWE-Bench) | 80%+ (closed) | 65% (open) | 15% gap | Still meaningful |

### Adoption Trends

According to [a16z research on enterprise AI adoption](https://a16z.com/2025/enterprise-ai-adoption-trends/):
- **41% of enterprises** will increase use of open-source models in 2026
- **41% additional** will switch from closed to open if performance reaches parity
- **Cost considerations** increasingly drive adoption over raw performance metrics

### Safety Implications

The rapid convergence creates new challenges:
- **Reduced barrier to entry** for potentially dangerous capabilities
- **Limited oversight** of open model development and deployment  
- **Difficulty implementing safeguards** across distributed open ecosystem
- **Accelerated capability proliferation** without centralized risk assessment

---

## 9. Lab Safety Team Turnover Rate
**Status:** ❌ **Declining** | **Data Quality:** Poor

### OpenAI Safety Team Exodus (2024-2025)

The [dissolution of OpenAI's Superalignment team in May 2024](https://www.cnbc.com/2024/05/17/openai-dissolves-superalignment-ai-safety-team.html) marked a critical inflection point:

#### Superalignment Team Departures

| Name | Role | Departure Date | Public Criticism | Post-Departure Role |
|------|------|---------------|-----------------|-------------------|
| Ilya Sutskever | Co-founder, Chief Scientist | May 14, 2024 | None | Stealth startup |
| Jan Leike | Head of Alignment | May 2024 | ["Safety culture took a backseat"](https://www.axios.com/2024/05/20/openai-safety-jan-leike-sam-altman) | Anthropic |
| Daniel Kokotajlo | Safety researcher | April 2024 | ["Lost confidence" in company](https://www.ft.com/content/openai-whistleblower-daniel-kokotajlo) | Independent advocacy |
| Leopold Aschenbrenner | Safety researcher | 2024 | Fired for information sharing | Independent research |
| William Saunders | Safety researcher | 2024 | None | Undisclosed |

#### Additional Senior Departures (September 2024)

- **Mira Murati** (CTO, 6 years at OpenAI)
- **Bob McGrew** (Chief Research Officer)  
- **Barret Zoph** (VP of Research)
- **Miles Brundage** (Policy Research Head)
- **Total documented senior departures:** 25+ as of December 2024[^20]

### Industry-Wide Safety Team Growth

The [AI Safety Field Growth Analysis from 2025](https://www.lesswrong.com/posts/8QjAnWyuE9fktPRgS/ai-safety-field-growth-analysis-2025) found significant expansion:

| Year | Technical AI Safety FTEs | Non-Technical Safety FTEs | Total | Growth Rate |
|------|-------------------------|---------------------------|-------|-------------|
| 2022 | ≈200 | ≈200 | 400 | Baseline |
| 2025 | ≈600 | ≈500 | 1,100 | 175% increase |

**Lab-Specific Growth:**
- **OpenAI:** Grew from 300 to 3,000 employees (10x)
- **Anthropic:** Grew >3x since 2022
- **Google DeepMind:** Grew >3x since 2022[^21]
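
The headline growth figures above are simple ratios of the reported counts; the sketch below recomputes them (note that the lab-specific OpenAI figure refers to total headcount, not safety FTEs).

```typescript
// Quick check of the reported growth figures (approximate FTE counts, as in the table).
const fte2022 = { technical: 200, nonTechnical: 200 };
const fte2025 = { technical: 600, nonTechnical: 500 };

const total2022 = fte2022.technical + fte2022.nonTechnical; // 400
const total2025 = fte2025.technical + fte2025.nonTechnical; // 1,100

const fieldGrowthPct = ((total2025 - total2022) / total2022) * 100; // 175%
console.log(`Field-wide safety FTE growth, 2022→2025 ≈ ${fieldGrowthPct.toFixed(0)}%`);

const openaiHeadcountMultiple = 3000 / 300; // total employees, not safety-specific
console.log(`OpenAI total headcount growth ≈ ${openaiHeadcountMultiple.toFixed(0)}x`);
```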

### Retention Challenges

**Jan Leike's Testimony:** In his [departure statement](https://www.axios.com/2024/05/20/openai-safety-jan-leike-sam-altman), he revealed: "Over the past few months my team has been sailing against the wind. Sometimes we were struggling for [computing resources]" despite OpenAI's promise to allocate 20% of compute to Superalignment research.

**Structural Issues:**
- High external demand for AI safety talent
- Burnout from rapid development pace  
- Philosophical disagreements over safety prioritization
- Resource allocation conflicts between safety and product teams

### Cross-Industry Safety Criticism (2025)

AI safety researchers from multiple organizations publicly criticized <EntityLink id="E378">xAI</EntityLink>'s safety culture, describing practices as ["reckless" and "completely irresponsible"](https://www.techpolicypress.com/ai-safety-researchers-criticize-xai/) following internal scandals.[^22]

---

## 10. Whistleblower Reports from AI Labs
**Status:** ⚠️ **Stable** | **Data Quality:** Poor

### Major Whistleblower Cases (2024-2025)

#### "The OpenAI Files" Investigation

[Compiled by the Midas Project and Tech Oversight Project](https://techoversight.org/reports/openai-files/), this report represents "the most comprehensive collection to date of documented concerns with governance practices, leadership integrity, and organizational culture at OpenAI."

**Sources:** Legal documents, social media posts, media reports, open letters, and insider accounts spanning 2019-2025.

#### Individual Whistleblower Cases

| Name | Company | Public Disclosure Date | Key Allegations | Legal Action |
|------|---------|----------------------|-----------------|--------------|
| Daniel Kokotajlo | OpenAI | April 2024 | ["Lost confidence" in safety practices](https://www.ft.com/content/openai-whistleblower-daniel-kokotajlo) | Restrictive NDA dispute |
| Jan Leike | OpenAI | May 2024 | [Safety "took a back seat to shiny products"](https://www.axios.com/2024/05/20/openai-safety-jan-leike-sam-altman) | None (standard departure) |
| Nine-person group | OpenAI | June 2024 | ["Recklessly racing" toward AGI](https://www.cnn.com/2024/06/13/tech/openai-employees-open-letter/index.html) | Open letter format |

### Legislative Response (2025)

#### Federal Level

**AI Whistleblower Protection Act:** Senate Judiciary Chair Chuck Grassley [introduced the bipartisan bill in May 2025](https://www.congress.gov/bill/119th-congress/senate-bill/1792/text), providing:
- Protection for AI security vulnerability disclosure
- Shields against retaliation for reporting violations  
- Addresses restrictive severance and NDAs creating "chilling effect"[^23]

#### State Level  

**California SB 53:** Provides [whistleblower protections starting January 1, 2026](https://www.sfpublicpress.org/californias-new-ai-safety-law-created-the-illusion-of-whistleblower-protections/), but critics note limitations:
- Only covers **four types** of critical safety incidents
- **Three of the four types** require that injury or death has already occurred
- The fourth requires accurately predicting a "catastrophic mass casualty event"[^24]

### Structural Barriers

#### Non-Disparagement Agreements

**OpenAI's Practice:** OpenAI initially conditioned equity vesting (≈\$1.7 million in Kokotajlo's case) on non-disparagement agreements. The practice was modified after public backlash and media exposure.

**Industry Pattern:** The [2025 AI Safety Index](https://futureoflife.org/ai-safety-index-summer-2025/) found that "only OpenAI has published its full policy, and it did so only after media reports revealed the policy's highly restrictive non-disparagement clauses."

### Cross-Lab Transparency Initiative

In early summer 2025, [Anthropic and OpenAI agreed to evaluate each other's models](https://www.anthropic.com/news/cross-lab-evaluation-partnership) using internal misalignment evaluations, representing increased transparency despite competitive pressures.

### Data Limitations

Actual whistleblower report frequency remains unknown due to:
- **Internal reporting systems** with no public disclosure
- **Fear of career consequences** deterring disclosure
- **Restrictive legal agreements** suppressing reports  
- **No centralized tracking mechanism** across the industry

---

## Predictive Analysis & Trends

### 2026 Forecasts

Based on current trajectories, we anticipate:

| Metric | 2026 Prediction | Confidence | Key Drivers |
|--------|----------------|------------|-------------|
| Voluntary Compliance | 45-55% (slight decline) | Medium | Competitive pressure, enforcement gaps |
| RSP Threshold Crossings | 2-3 additional ASL-3 activations | High | Capability acceleration |
| Evaluation Timelines | Further compression to days | High | Release velocity pressure |
| Open-Source Gap | Near parity (0-3 months) | Very High | DeepSeek R1 impact, economic pressure |
| Whistleblower Reports | 3-5 major cases | Medium | New legal protections, industry growth |

### Systemic Risk Patterns

**Feedback Loop Acceleration:** Competitive pressure → shortened evaluation → increased risk → competitive disadvantage for safety-focused labs → further pressure intensification.

**Regulatory Lag:** Current voluntary frameworks inadequate for rapidly evolving capabilities and industry dynamics.

**International Divergence:** U.S. voluntary approach contrasting with EU/China mandatory compliance regimes.

---

## Methodology & Data Quality Summary

| Metric Category | Data Quality Score | Primary Limitation | Improvement Needed |
|-----------------|-------------------|-------------------|-------------------|
| Compliance Tracking | 7/10 | Self-reported data | Independent verification |
| Safety Evaluations | 4/10 | Company-controlled disclosure | Mandatory reporting |
| Personnel Changes | 3/10 | Only public departures visible | Industry-wide surveys |
| Technical Capabilities | 8/10 | Benchmark gaming potential | Standardized evaluations |
| Whistleblowing | 2/10 | Structural reporting barriers | Legal protections |

### Key Data Gaps

1. **Internal turnover rates** for safety-specific teams
2. **Detailed evaluation methodologies** and pass/fail criteria  
3. **International lab practices** beyond U.S./UK companies
4. **Quantified risk thresholds** for deployment decisions
5. **Standardized safety metrics** enabling cross-lab comparison

---

## Key Takeaways

### Critical Findings

1. **Mixed Compliance Reality:** Average 53% compliance with voluntary commitments masks significant variation (17-83%) and systemic weaknesses in critical areas like model weight security

2. **Evaluation Time Compression Crisis:** Safety testing compressed from months to days at leading labs, with OpenAI reducing o3 evaluation to less than one week despite advanced capabilities

3. **Open-Source Convergence Acceleration:** DeepSeek R1's January 2025 release achieved performance parity at 1/27th the cost, fundamentally altering competitive dynamics and safety oversight challenges

4. **Safety Team Retention Crisis:** 25+ senior safety researchers departed OpenAI in 2024, including entire Superalignment team dissolution, indicating systematic cultural or resource allocation issues

5. **Transparency Deterioration:** Major models released without promised safety documentation (Google Gemini 2.5 Pro, OpenAI GPT-4.1), violating government commitments

### Systemic Concerns

**Competitive Pressure Override:** Evidence suggests commercial competition is systematically overriding safety considerations across multiple metrics simultaneously.

**Voluntary Framework Inadequacy:** Current self-regulatory approaches appear insufficient for the scale and pace of capability development.

**Information Asymmetry:** Companies control both risk evaluation design and disclosure, creating inherent conflicts of interest.

### Positive Developments

- First confirmed RSP threshold crossing (Anthropic Claude Opus 4 ASL-3) demonstrates policy operationalization
- Claude 4.5 Sonnet achieved 98.7% safety score with \<5% harmful compliance rate
- The New York RAISE Act and the proposed federal AI Whistleblower Protection Act signal regulatory evolution
- Industry safety team growth (1,100 FTEs vs. 400 in 2022) shows resource commitment expansion

---

[^1]: Future of Life Institute, "2025 AI Safety Index," Summer 2025, https://futureoflife.org/ai-safety-index-summer-2025/

[^2]: METR, "Common Elements of Frontier AI Safety Policies (December 2025 Update)," December 9, 2025, https://metr.org/blog/2025-12-09-common-elements-of-frontier-ai-safety-policies/

[^3]: OpenAI Preparedness Framework changelog analysis, April 2025

[^4]: SaferAI, "Anthropic's Responsible Scaling Policy Update Makes a Step Backwards," 2025, https://www.safer-ai.org/anthropics-responsible-scaling-policy-update-makes-a-step-backwards

[^5]: Apollo Research evaluation methodology studies, 2025

[^6]: Financial Times, "OpenAI Safety Evaluation Timeline Compression," late 2025

[^7]: NIST, "Pre-Deployment Evaluation of OpenAI's o1 Model," December 2024, https://www.nist.gov/news-events/news/2024/12/nist-releases-pre-deployment-safety-evaluation-openais-o1-model

[^8]: METR, "AI models can be dangerous before public deployment," November 13, 2024, https://metr.org/blog/2024-11-13-ai-models-can-be-dangerous-before-public-deployment/

[^9]: HackerOne, "AI Red Teaming | Offensive Testing for AI Models," 2025, https://www.hackerone.com/ai-red-teaming

[^10]: Anthropic, "Jailbreak Challenge Results," 2025, https://www.anthropic.com/news/jailbreak-challenge

[^11]: CISA, "AI Red Teaming: Applying Software TEVV for AI Evaluations," November 2024, https://www.cisa.gov/sites/default/files/2024-11/CISA_AI_Red_Teaming_Guide.pdf

[^12]: UK Parliament Science and Technology Committee, "Open Letter on Google DeepMind Disclosure Delays," 2025

[^13]: Future of Life Institute, "2025 AI Safety Index," Summer 2025

[^14]: Davis Wright Tremaine, "New York Enacts RAISE Act for AI Transparency Amid Federal Preemption Debate," December 19, 2025, https://www.dwt.com/blogs/artificial-intelligence-law-advisor/2025/12/new-york-raise-act-ai-safety-rules-developers

[^15]: The White House, "Ensuring a National Policy Framework for Artificial Intelligence," December 11, 2025, https://www.whitehouse.gov/presidential-actions/2025/12/eliminating-state-law-obstruction-of-national-artificial-intelligence-policy/

[^16]: Future of Life Institute, "2025 AI Safety Index," Summer 2025

[^17]: The Verge, "OpenAI Issues 'Code Red' Following Gemini 3 Launch," December 2025

[^18]: PassionFruit, "GPT 5.1 vs Claude 4.5 vs Gemini 3: 2025 AI Comparison," 2025, https://www.getpassionfruit.com/blog/gpt-5-1-vs-claude-4-5-sonnet-vs-gemini-3-pro-vs-deepseek-v3-2-the-definitive-2025-ai-model-comparison

[^19]: FourWeekMBA, "The Open Model Convergence: How the Frontier Gap Collapsed to 6 Months," 2025, https://fourweekmba.com/the-open-model-convergence-how-the-frontier-gap-collapsed-to-6-months/

[^20]: Various news sources tracking OpenAI departures, compiled December 2024

[^21]: LessWrong, "AI Safety Field Growth Analysis 2025," 2025, https://www.lesswrong.com/posts/8QjAnWyuE9fktPRgS/ai-safety-field-growth-analysis-2025

[^22]: Tech Policy Press, "AI Safety Researchers Criticize xAI Safety Culture," 2025

[^23]: U.S. Congress, "S.1792 - AI Whistleblower Protection Act," May 2025, https://www.congress.gov/bill/119th-congress/senate-bill/1792/text

[^24]: SF Public Press, "California AI Law Created Illusion of Whistleblower Protections," 2025, https://www.sfpublicpress.org/californias-new-ai-safety-law-created-the-illusion-of-whistleblower-protections/