Longterm Wiki

Evals-Based Deployment Gates

evals-governance (approach) — Path: /knowledge-base/responses/evals-governance/
Entity ID (EID): E459
2 backlinks · Quality: 66 · Updated: 2026-01-29
Page Record (database.json) — merged from MDX frontmatter + Entity YAML + computed metrics at build time
{
  "id": "evals-governance",
  "wikiId": "E459",
  "path": "/knowledge-base/responses/evals-governance/",
  "filePath": "knowledge-base/responses/evals-governance.mdx",
  "title": "Evals-Based Deployment Gates",
  "quality": 66,
  "readerImportance": 41.5,
  "researchImportance": 70.5,
  "tacticalValue": null,
  "contentFormat": "article",
  "causalLevel": null,
  "lastUpdated": "2026-01-29",
  "dateCreated": "2026-02-15",
  "summary": "Evals-based deployment gates create formal checkpoints requiring AI systems to pass safety evaluations before deployment, with the EU AI Act imposing fines of up to EUR 35M or 7% of worldwide turnover and the UK AISI having tested 30+ models. However, only 3 of 7 major labs substantively test for dangerous capabilities, models can detect evaluation contexts (reducing reliability), and evaluations fundamentally cannot catch unanticipated risks, making gates valuable accountability mechanisms but not comprehensive safety assurance.",
  "description": "Evals-based deployment gates require AI models to pass safety evaluations before deployment or capability scaling.",
  "ratings": {
    "novelty": 4.5,
    "rigor": 7,
    "completeness": 7.5,
    "actionability": 7.5
  },
  "category": "responses",
  "subcategory": "alignment-policy",
  "clusters": [
    "ai-safety",
    "governance"
  ],
  "metrics": {
    "wordCount": 4073,
    "tableCount": 31,
    "diagramCount": 3,
    "internalLinks": 6,
    "externalLinks": 73,
    "footnoteCount": 0,
    "bulletRatio": 0.03,
    "sectionCount": 42,
    "hasOverview": true,
    "structuralScore": 15
  },
  "suggestedQuality": 100,
  "updateFrequency": 21,
  "evergreen": true,
  "wordCount": 4073,
  "unconvertedLinks": [
    {
      "text": "2025 AI Safety Index",
      "url": "https://futureoflife.org/ai-safety-index-summer-2025/",
      "resourceId": "df46edd6fa2078d1",
      "resourceTitle": "FLI AI Safety Index Summer 2025"
    },
    {
      "text": "EU AI Act",
      "url": "https://artificialintelligenceact.eu/",
      "resourceId": "1ad6dc89cded8b0c",
      "resourceTitle": "EU AI Act – Official Resource Hub"
    },
    {
      "text": "UK AISI Frontier AI Trends Report",
      "url": "https://www.aisi.gov.uk/frontier-ai-trends-report",
      "resourceId": "7042c7f8de04ccb1",
      "resourceTitle": "AISI Frontier AI Trends"
    },
    {
      "text": "EU AI Act",
      "url": "https://artificialintelligenceact.eu/",
      "resourceId": "1ad6dc89cded8b0c",
      "resourceTitle": "EU AI Act – Official Resource Hub"
    },
    {
      "text": "16 companies at the Seoul Summit",
      "url": "https://www.gov.uk/government/publications/frontier-ai-safety-commitments-ai-seoul-summit-2024/frontier-ai-safety-commitments-ai-seoul-summit-2024",
      "resourceId": "4487a62bbc1c45d6",
      "resourceTitle": "Seoul Frontier AI Safety Commitments"
    },
    {
      "text": "UK AI Security Institute",
      "url": "https://www.aisi.gov.uk/",
      "resourceId": "fdf68a8f30f57dee",
      "resourceTitle": "UK AI Safety Institute (AISI)"
    },
    {
      "text": "METR",
      "url": "https://metr.org/",
      "resourceId": "45370a5153534152",
      "resourceTitle": "METR: Model Evaluation and Threat Research"
    },
    {
      "text": "2025 AI Safety Index",
      "url": "https://futureoflife.org/ai-safety-index-summer-2025/",
      "resourceId": "df46edd6fa2078d1",
      "resourceTitle": "FLI AI Safety Index Summer 2025"
    },
    {
      "text": "International AI Safety Report 2025",
      "url": "https://internationalaisafetyreport.org/publication/international-ai-safety-report-2025",
      "resourceId": "b163447fdc804872",
      "resourceTitle": "International AI Safety Report 2025"
    },
    {
      "text": "Apollo Research",
      "url": "https://www.apolloresearch.ai/blog/claude-sonnet-37-often-knows-when-its-in-alignment-evaluations/",
      "resourceId": "f5ef9e486e36fbee",
      "resourceTitle": "Apollo Research found"
    },
    {
      "text": "EU AI Act",
      "url": "https://artificialintelligenceact.eu/",
      "resourceId": "1ad6dc89cded8b0c",
      "resourceTitle": "EU AI Act – Official Resource Hub"
    },
    {
      "text": "US EO 14110",
      "url": "https://www.federalregister.gov/documents/2023/11/01/2023-24283/safe-secure-and-trustworthy-development-and-use-of-artificial-intelligence",
      "resourceId": "80350b150694b2ae",
      "resourceTitle": "Executive Order 14110"
    },
    {
      "text": "UK AISI",
      "url": "https://www.aisi.gov.uk/",
      "resourceId": "fdf68a8f30f57dee",
      "resourceTitle": "UK AI Safety Institute (AISI)"
    },
    {
      "text": "NIST AI RMF",
      "url": "https://www.nist.gov/artificial-intelligence",
      "resourceId": "85ee8e554a07476b",
      "resourceTitle": "Guidelines and standards"
    },
    {
      "text": "Anthropic RSP",
      "url": "https://www.anthropic.com/index/anthropics-responsible-scaling-policy",
      "resourceId": "c637506d2cd4d849",
      "resourceTitle": "Anthropic's Responsible Scaling Policy"
    },
    {
      "text": "OpenAI Preparedness",
      "url": "https://openai.com/preparedness",
      "resourceId": "90a03954db3c77d5",
      "resourceTitle": "OpenAI Preparedness Framework"
    },
    {
      "text": "EU AI Act",
      "url": "https://artificialintelligenceact.eu/",
      "resourceId": "1ad6dc89cded8b0c",
      "resourceTitle": "EU AI Act – Official Resource Hub"
    },
    {
      "text": "EU AI Act Implementation Timeline",
      "url": "https://artificialintelligenceact.eu/implementation-timeline/",
      "resourceId": "0aa9d7ba294a35d9",
      "resourceTitle": "EU AI Act Implementation Timeline"
    },
    {
      "text": "Anthropic estimate",
      "url": "https://www.congress.gov/crs-product/R47843",
      "resourceId": "7f5cff0680d15cc8",
      "resourceTitle": "Congress.gov CRS Report"
    },
    {
      "text": "UK AISI 2025 Year in Review",
      "url": "https://www.aisi.gov.uk/blog/our-2025-year-in-review",
      "resourceId": "3dec5f974c5da5ec",
      "resourceTitle": "Our 2025 Year in Review"
    },
    {
      "text": "METR",
      "url": "https://metr.org/",
      "resourceId": "45370a5153534152",
      "resourceTitle": "METR: Model Evaluation and Threat Research"
    },
    {
      "text": "Apollo Research",
      "url": "https://www.apolloresearch.ai/",
      "resourceId": "329d8c2e2532be3d",
      "resourceTitle": "Apollo Research - AI Safety Evaluation Organization"
    },
    {
      "text": "UK AISI",
      "url": "https://www.aisi.gov.uk/",
      "resourceId": "fdf68a8f30f57dee",
      "resourceTitle": "UK AI Safety Institute (AISI)"
    },
    {
      "text": "2025 AI Safety Index",
      "url": "https://futureoflife.org/ai-safety-index-summer-2025/",
      "resourceId": "df46edd6fa2078d1",
      "resourceTitle": "FLI AI Safety Index Summer 2025"
    },
    {
      "text": "Frontier AI Safety Commitments",
      "url": "https://www.gov.uk/government/publications/frontier-ai-safety-commitments-ai-seoul-summit-2024/frontier-ai-safety-commitments-ai-seoul-summit-2024",
      "resourceId": "4487a62bbc1c45d6",
      "resourceTitle": "Seoul Frontier AI Safety Commitments"
    },
    {
      "text": "METR Frontier AI Safety Policies Tracker",
      "url": "https://metr.org/faisc",
      "resourceId": "7e3b7146e1266c71",
      "resourceTitle": "METR's Analysis of Frontier AI Safety Cases (FAISC)"
    },
    {
      "text": "UK AISI Frontier AI Trends Report",
      "url": "https://www.aisi.gov.uk/frontier-ai-trends-report",
      "resourceId": "7042c7f8de04ccb1",
      "resourceTitle": "AISI Frontier AI Trends"
    },
    {
      "text": "Apollo Research",
      "url": "https://www.apolloresearch.ai/",
      "resourceId": "329d8c2e2532be3d",
      "resourceTitle": "Apollo Research - AI Safety Evaluation Organization"
    },
    {
      "text": "Claude Sonnet 3.7 often recognizes when it's in alignment evaluations",
      "url": "https://www.apolloresearch.ai/blog/claude-sonnet-37-often-knows-when-its-in-alignment-evaluations/",
      "resourceId": "f5ef9e486e36fbee",
      "resourceTitle": "Apollo Research found"
    },
    {
      "text": "UK-US joint model evaluation",
      "url": "https://www.aisi.gov.uk/",
      "resourceId": "fdf68a8f30f57dee",
      "resourceTitle": "UK AI Safety Institute (AISI)"
    },
    {
      "text": "Anthropic-OpenAI joint evaluation",
      "url": "https://alignment.anthropic.com/2025/openai-findings/",
      "resourceId": "2fdf91febf06daaf",
      "resourceTitle": "Anthropic-OpenAI joint evaluation"
    },
    {
      "text": "Frontier AI Trends Report",
      "url": "https://www.aisi.gov.uk/frontier-ai-trends-report",
      "resourceId": "7042c7f8de04ccb1",
      "resourceTitle": "AISI Frontier AI Trends"
    },
    {
      "text": "Joint UK-US pre-deployment evaluation of OpenAI o1",
      "url": "https://www.aisi.gov.uk/blog/pre-deployment-evaluation-of-openais-o1-model",
      "resourceId": "e23f70e673a090c1",
      "resourceTitle": "Pre-Deployment evaluation of OpenAI's o1 model"
    },
    {
      "text": "UK AISI 2025 Year in Review",
      "url": "https://www.aisi.gov.uk/blog/our-2025-year-in-review",
      "resourceId": "3dec5f974c5da5ec",
      "resourceTitle": "Our 2025 Year in Review"
    },
    {
      "text": "OpenAI-Apollo partnership",
      "url": "https://openai.com/index/detecting-and-reducing-scheming-in-ai-models/",
      "resourceId": "b3f335edccfc5333",
      "resourceTitle": "OpenAI Preparedness Framework"
    },
    {
      "text": "Bloom tool",
      "url": "https://alignment.anthropic.com/2025/bloom-auto-evals/",
      "resourceId": "7fa7d4cb797a5edd",
      "resourceTitle": "Bloom: Automated Behavioral Evaluations"
    },
    {
      "text": "Inspect tools",
      "url": "https://inspect.aisi.org.uk/",
      "resourceId": "fc3078f3c2ba5ebb",
      "resourceTitle": "UK AI Safety Institute's Inspect framework"
    },
    {
      "text": "International AI Safety Report 2025",
      "url": "https://internationalaisafetyreport.org/publication/international-ai-safety-report-2025",
      "resourceId": "b163447fdc804872",
      "resourceTitle": "International AI Safety Report 2025"
    },
    {
      "text": "EU AI Act",
      "url": "https://artificialintelligenceact.eu/",
      "resourceId": "1ad6dc89cded8b0c",
      "resourceTitle": "EU AI Act – Official Resource Hub"
    },
    {
      "text": "EU AI Act Implementation Timeline",
      "url": "https://artificialintelligenceact.eu/implementation-timeline/",
      "resourceId": "0aa9d7ba294a35d9",
      "resourceTitle": "EU AI Act Implementation Timeline"
    },
    {
      "text": "NIST AI RMF",
      "url": "https://www.nist.gov/artificial-intelligence/ai-standards",
      "resourceId": "e4c2d8b8332614cc",
      "resourceTitle": "NIST: AI Standards Portal"
    },
    {
      "text": "UK AISI 2025 Review",
      "url": "https://www.aisi.gov.uk/blog/our-2025-year-in-review",
      "resourceId": "3dec5f974c5da5ec",
      "resourceTitle": "Our 2025 Year in Review"
    },
    {
      "text": "UK AISI Evaluations Update",
      "url": "https://www.aisi.gov.uk/blog/advanced-ai-evaluations-may-update",
      "resourceId": "4e56cdf6b04b126b",
      "resourceTitle": "UK AI Safety Institute renamed to AI Security Institute"
    },
    {
      "text": "EO 14110",
      "url": "https://www.federalregister.gov/documents/2023/11/01/2023-24283/safe-secure-and-trustworthy-development-and-use-of-artificial-intelligence",
      "resourceId": "80350b150694b2ae",
      "resourceTitle": "Executive Order 14110"
    },
    {
      "text": "Responsible Scaling Policy",
      "url": "https://www.anthropic.com/index/anthropics-responsible-scaling-policy",
      "resourceId": "c637506d2cd4d849",
      "resourceTitle": "Anthropic's Responsible Scaling Policy"
    },
    {
      "text": "Preparedness Framework",
      "url": "https://openai.com/preparedness",
      "resourceId": "90a03954db3c77d5",
      "resourceTitle": "OpenAI Preparedness Framework"
    },
    {
      "text": "Joint Evaluation Exercise",
      "url": "https://alignment.anthropic.com/2025/openai-findings/",
      "resourceId": "2fdf91febf06daaf",
      "resourceTitle": "Anthropic-OpenAI joint evaluation"
    },
    {
      "text": "Bloom Auto-Evals",
      "url": "https://alignment.anthropic.com/2025/bloom-auto-evals/",
      "resourceId": "7fa7d4cb797a5edd",
      "resourceTitle": "Bloom: Automated Behavioral Evaluations"
    },
    {
      "text": "Automated Auditing Agents",
      "url": "https://alignment.anthropic.com/2025/automated-auditing/",
      "resourceId": "bda3ba0731666dc7",
      "resourceTitle": "10-42% correct root cause identification"
    },
    {
      "text": "METR",
      "url": "https://metr.org/",
      "resourceId": "45370a5153534152",
      "resourceTitle": "METR: Model Evaluation and Threat Research"
    },
    {
      "text": "GPT-5 evaluation",
      "url": "https://evaluations.metr.org/gpt-5-report/",
      "resourceId": "7457262d461e2206",
      "resourceTitle": "Details about METR’s evaluation of OpenAI GPT-5"
    },
    {
      "text": "GPT-4.5 evals",
      "url": "https://metr.org/blog/2025-02-27-gpt-4-5-evals/",
      "resourceId": "a86b4f04559de6da",
      "resourceTitle": "METR’s GPT-4.5 pre-deployment evaluations"
    },
    {
      "text": "Apollo Research",
      "url": "https://www.apolloresearch.ai/",
      "resourceId": "329d8c2e2532be3d",
      "resourceTitle": "Apollo Research - AI Safety Evaluation Organization"
    },
    {
      "text": "UK AISI",
      "url": "https://www.aisi.gov.uk/",
      "resourceId": "fdf68a8f30f57dee",
      "resourceTitle": "UK AI Safety Institute (AISI)"
    },
    {
      "text": "Frontier AI Trends Report",
      "url": "https://www.aisi.gov.uk/frontier-ai-trends-report",
      "resourceId": "7042c7f8de04ccb1",
      "resourceTitle": "AISI Frontier AI Trends"
    },
    {
      "text": "Inspect framework",
      "url": "https://inspect.aisi.org.uk/",
      "resourceId": "fc3078f3c2ba5ebb",
      "resourceTitle": "UK AI Safety Institute's Inspect framework"
    },
    {
      "text": "Future of Life Institute",
      "url": "https://futureoflife.org/",
      "resourceId": "786a68a91a7d5712",
      "resourceTitle": "Future of Life Institute"
    },
    {
      "text": "AI Safety Index",
      "url": "https://futureoflife.org/ai-safety-index-summer-2025/",
      "resourceId": "df46edd6fa2078d1",
      "resourceTitle": "FLI AI Safety Index Summer 2025"
    },
    {
      "text": "AI Safety Index 2025",
      "url": "https://futureoflife.org/ai-safety-index-summer-2025/",
      "resourceId": "df46edd6fa2078d1",
      "resourceTitle": "FLI AI Safety Index Summer 2025"
    }
  ],
  "unconvertedLinkCount": 59,
  "convertedLinkCount": 0,
  "backlinkCount": 2,
  "hallucinationRisk": {
    "level": "low",
    "score": 30,
    "factors": [
      "no-citations",
      "high-rigor",
      "conceptual-content"
    ]
  },
  "entityType": "approach",
  "redundancy": {
    "maxSimilarity": 19,
    "similarPages": [
      {
        "id": "rsp",
        "title": "Responsible Scaling Policies",
        "path": "/knowledge-base/responses/rsp/",
        "similarity": 19
      },
      {
        "id": "model-auditing",
        "title": "Third-Party Model Auditing",
        "path": "/knowledge-base/responses/model-auditing/",
        "similarity": 18
      },
      {
        "id": "dangerous-cap-evals",
        "title": "Dangerous Capability Evaluations",
        "path": "/knowledge-base/responses/dangerous-cap-evals/",
        "similarity": 17
      },
      {
        "id": "evals",
        "title": "Evals & Red-teaming",
        "path": "/knowledge-base/responses/evals/",
        "similarity": 17
      },
      {
        "id": "intervention-effectiveness-matrix",
        "title": "Intervention Effectiveness Matrix",
        "path": "/knowledge-base/models/intervention-effectiveness-matrix/",
        "similarity": 15
      }
    ]
  },
  "changeHistory": [
    {
      "date": "2026-02-15",
      "branch": "claude/extract-wiki-interventions-WpOs4",
      "title": "Extract wiki proposals as structured data",
      "summary": "Created two new data layers:\n1. **Interventions** (broad categories): Extended `Intervention` schema with risk coverage matrix, ITN prioritization, funding data. Created `data/interventions.yaml` with 14 broad intervention categories. `InterventionCard`/`InterventionList` components.\n2. **Proposals** (narrow, tactical): New `Proposal` data type for specific, speculative, actionable items extracted from wiki pages. Created `data/proposals.yaml` with 27 proposals across 6 domains (philanthropic, financial, governance, technical, biosecurity, field-building). Each has cost/EV estimates, honest concerns, feasibility, stance (collaborative/adversarial). `ProposalCard`/`ProposalList` components.\n\nPost-review fixes: Fixed 13 incorrect wikiPageId E-codes in interventions.yaml (used numeric IDs instead of entity slugs). Added Intervention + Proposal to schema validator. Extracted shared badge color maps from 4 components into `badge-styles.ts`. Removed unused `client:load` prop and `fundingShare` destructure.",
      "pr": 141
    }
  ],
  "coverage": {
    "passing": 9,
    "total": 13,
    "targets": {
      "tables": 16,
      "diagrams": 2,
      "internalLinks": 33,
      "externalLinks": 20,
      "footnotes": 12,
      "references": 12
    },
    "actuals": {
      "tables": 31,
      "diagrams": 3,
      "internalLinks": 6,
      "externalLinks": 73,
      "footnotes": 0,
      "references": 28,
      "quotesWithQuotes": 0,
      "quotesTotal": 0,
      "accuracyChecked": 0,
      "accuracyTotal": 0
    },
    "items": {
      "summary": "green",
      "schedule": "green",
      "entity": "green",
      "editHistory": "green",
      "overview": "green",
      "tables": "green",
      "diagrams": "green",
      "internalLinks": "amber",
      "externalLinks": "green",
      "footnotes": "red",
      "references": "green",
      "quotes": "red",
      "accuracy": "red"
    },
    "editHistoryCount": 1,
    "ratingsString": "N:4.5 R:7 A:7.5 C:7.5"
  },
  "readerRank": 363,
  "researchRank": 145,
  "recommendedScore": 166.27
}
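The `coverage` block above pairs per-item `targets` with build-time `actuals` and assigns each item a green/amber/red status. The record does not show the exact rule, but a plausible reconstruction (hypothetical thresholds, chosen to match the example data: green when the target is met, amber when partially met, red when nothing is present) looks like this:

```python
# Hypothetical sketch of how coverage statuses might be derived.
# The real build-time rule is not shown in the record; the thresholds
# below are assumptions chosen to reproduce the statuses in this page.

def coverage_status(actual: int, target: int) -> str:
    """Return 'green', 'amber', or 'red' for one coverage item."""
    if actual >= target:
        return "green"   # target met or exceeded
    if actual > 0:
        return "amber"   # partially met
    return "red"         # nothing present

targets = {"tables": 16, "diagrams": 2, "internalLinks": 33,
           "externalLinks": 20, "footnotes": 12, "references": 12}
actuals = {"tables": 31, "diagrams": 3, "internalLinks": 6,
           "externalLinks": 73, "footnotes": 0, "references": 28}

statuses = {name: coverage_status(actuals[name], targets[name])
            for name in targets}
print(statuses)
```

Under these assumed thresholds, `internalLinks` (6 of 33) comes out amber and `footnotes` (0 of 12) red, matching the `items` map in the record.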
External Links
{
  "lesswrong": "https://www.lesswrong.com/tag/ai-evaluations"
}
Backlinks (2)
| id | title | type | relationship |
|---|---|---|---|
| alignment-policy-overview | Policy & Governance (Overview) | concept | |
| forecasting-based-policy-triggers | Forecasting-Based Policy Triggers | approach | |
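The record header notes that the Page Record is merged from MDX frontmatter, Entity YAML, and computed metrics at build time. A minimal sketch of such a merge (field names and key precedence are assumptions for illustration, not the wiki's documented rule):

```python
# Hypothetical sketch of the build-time merge described in the header:
# MDX frontmatter + Entity YAML + computed metrics -> one page record.
# The precedence (frontmatter overrides entity defaults) is an assumption.

def build_page_record(frontmatter: dict, entity_yaml: dict, metrics: dict) -> dict:
    record: dict = {}
    record.update(entity_yaml)    # entity-level fields (id, entityType, ...)
    record.update(frontmatter)    # page-level fields override entity defaults
    record["metrics"] = metrics   # computed at build time, kept under one key
    return record

record = build_page_record(
    {"title": "Evals-Based Deployment Gates", "quality": 66},
    {"id": "evals-governance", "entityType": "approach"},
    {"wordCount": 4073, "tableCount": 31},
)
print(record["id"], record["title"])
```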