# Automation Tools
This page documents all automation tools available for maintaining and improving the knowledge base.
## Quick Reference

| Tool | Purpose | Command |
|---|---|---|
| Content Commands | Improve, grade, create pages | `pnpm crux w` |
| Validators | Check content quality | `pnpm crux w validate` |
| Analyzers | Analysis and reporting | `pnpm crux w analyze` |
| Auto-fixers | Fix common issues | `pnpm crux w fix` |
| Data Builder | Regenerate entity data | `pnpm build-data` |
| Resources | External resource management | `pnpm crux w resources` |
## Page Improvement Workflow

This is the recommended workflow for bringing wiki pages up to quality 5 (Q5).
### Commands

```sh
# List pages that need improvement (sorted by priority)
pnpm crux w improve --list

# Improve a specific page
pnpm crux w improve economic-disruption --tier=standard --apply

# Show page info only (no prompt)
pnpm crux w improve racing-dynamics --info

# Filter by quality and importance
pnpm crux w improve --list --max-qual 3 --min-imp 50
```
### What Makes a Q5 Page

Q5 is an internally-defined quality standard used by the grading pipeline to classify pages as comprehensive.[^1] The following elements are required:

| Element | Requirement |
|---|---|
| Quick Assessment Table | 5+ rows, 3 columns (Dimension, Assessment, Evidence) |
| Substantive Tables | 2+ additional tables with real data |
| Mermaid Diagram | 1+ showing key relationships[^2] |
| Citations | 10+ real URLs from authoritative sources |
| Quantified Claims | Replace "significant" with "25-40%" etc. |
| Word Count | 800+ words of substantive content |
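As an illustration, the checklist above can be expressed as a predicate over simple page statistics. This is a hypothetical sketch; the authoritative rules live in `crux/lib/page-templates.ts`, and the field names here are invented for the example.

```typescript
// Hypothetical page statistics; field names are invented for this sketch.
interface PageStats {
  assessmentRows: number;  // rows in the Quick Assessment Table
  extraTables: number;     // substantive tables beyond the assessment table
  mermaidDiagrams: number;
  citations: number;       // real URLs from authoritative sources
  words: number;           // substantive word count
}

// A page meets Q5 only if every threshold from the table above is satisfied.
function meetsQ5(s: PageStats): boolean {
  return (
    s.assessmentRows >= 5 &&
    s.extraTables >= 2 &&
    s.mermaidDiagrams >= 1 &&
    s.citations >= 10 &&
    s.words >= 800
  );
}
```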
### Cost Estimates

Cost per page depends on the improvement tier and the Claude model used.[^3] The estimates below are approximate and reflect typical token usage; actual costs vary with page length and complexity.
| Model | Approximate Cost per Page |
|---|---|
| Claude Opus 4.5 | $3–5 |
| Claude Sonnet 4.5 | $0.50–1.00 |
### Reference Examples

The following pages meet Q5 criteria and can serve as structural references:

- `content/docs/knowledge-base/risks/bioweapons.mdx` — includes Quick Assessment Table, 2+ data tables, 1+ Mermaid diagram, and 10+ citations
- `content/docs/knowledge-base/risks/racing-dynamics.mdx` — demonstrates quantified claims and structured evidence sections
## Scheduled Updates

The update schedule system tracks which pages are overdue for refresh based on `update_frequency` (days) in frontmatter. It prioritizes pages by combining staleness with importance.
### How Scheduling Works

Each page declares how often it should be updated:

```yaml
update_frequency: 7 # Check weekly
lastEdited: "2026-01-15"
readerImportance: 85
```
Priority scoring formula: `staleness × (importance / 100)`, where `staleness = days since edit / update frequency`. Pages with staleness >= 1.0 are overdue.
| Importance | Default Frequency |
|---|---|
| >= 80 | 7 days (weekly) |
| >= 60 | 21 days (3 weeks) |
| >= 40 | 45 days (6 weeks) |
| >= 20 | 90 days (3 months) |
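As a sketch, the priority formula can be written out directly. This mirrors the description above, not the actual scheduler code:

```typescript
// staleness = days since last edit / declared update frequency.
function staleness(daysSinceEdit: number, updateFrequency: number): number {
  return daysSinceEdit / updateFrequency;
}

// priority = staleness × (importance / 100); staleness >= 1.0 means overdue.
function priorityScore(
  daysSinceEdit: number,
  updateFrequency: number,
  readerImportance: number,
): number {
  return staleness(daysSinceEdit, updateFrequency) * (readerImportance / 100);
}

// A page edited 14 days ago with update_frequency: 7 and readerImportance: 85
// has staleness 2.0 (overdue) and priority 1.7.
```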
### Opting Out: Non-Evergreen Pages

Some pages are point-in-time content (reports, experiments, proposals) that don't need ongoing updates. Set `evergreen: false` in frontmatter to exclude them from the update schedule:

```yaml
evergreen: false # This page is a snapshot, not maintained
```

Pages with `evergreen: false` are skipped by the update schedule, the bootstrap script, and the reassign script. All existing report pages under `internal/reports/` use this flag.
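The opt-out rule can be sketched as a filter over page frontmatter. This is illustrative only; the real scripts may implement it differently:

```typescript
// Minimal slice of frontmatter relevant to scheduling. An absent
// evergreen field means the page IS evergreen and stays on the schedule.
interface PageMeta {
  slug: string;
  evergreen?: boolean;
}

// Keep every page except those explicitly marked evergreen: false.
function schedulable(pages: PageMeta[]): PageMeta[] {
  return pages.filter((p) => p.evergreen !== false);
}
```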
### Commands

```sh
# See what needs updating
pnpm crux updates list                       # Top 10 by priority
pnpm crux updates list --overdue --limit=20  # All overdue pages

# Preview triage recommendations (~$0.08/page based on Sonnet 4.5 rates)
pnpm crux updates triage --count=10

# Run updates (triage is ON by default)
pnpm crux updates run --count=5                            # Triage + update top 5
pnpm crux updates run --count=3 --no-triage --tier=polish  # Skip triage

# Statistics
pnpm crux updates stats
```
### Cost-Aware Triage

By default, `updates run` performs a low-cost news check (≈$0.08 per page at Sonnet 4.5 rates)[^3] using web search + SCRY before committing to a full update. This check recommends a tier based on whether new developments were found:

| Triage Result | Action | Cost |
|---|---|---|
| skip | No new developments — page is current | $0 |
| polish | Minor tweaks only | $2–3[^3] |
| standard | Notable new developments to add | $5–8[^3] |
| deep | Major developments requiring thorough research | $10–15[^3] |
Example scenario: 10 pages scheduled at standard tier ≈ $65 total. If triage finds that 6 have no new developments, those are skipped: ≈$26 (4 pages × standard) + $0.80 triage overhead ≈ $27. Note that when most pages do require updates, triage overhead adds cost without reducing the update spend; the net benefit depends on the proportion of pages that can be skipped.
Use `--no-triage` to skip the news check and apply a fixed tier to all pages.
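The arithmetic in the example scenario can be captured in a small cost model. The per-page figures below ($6.50 for a standard update, $0.08 for triage) are midpoints of the ranges quoted on this page, not billing numbers:

```typescript
// Total cost for a batch: every page pays the triage check, and only
// non-skipped pages pay the full update tier.
function updateCost(
  pages: number,
  skipped: number,
  perPageUpdate = 6.5,
  perPageTriage = 0.08,
): number {
  return (pages - skipped) * perPageUpdate + pages * perPageTriage;
}

updateCost(10, 0, 6.5, 0); // $65: all 10 pages at standard, no triage
updateCost(10, 6);         // ≈$26.80: 4 updates plus $0.80 triage overhead
```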
### Update Tiers (Page Improver)

```sh
# Single page with specific tier
pnpm crux w improve <page-id> --tier=polish --apply
pnpm crux w improve <page-id> --tier=standard --apply --grade
pnpm crux w improve <page-id> --tier=deep --apply --grade

# Auto-select tier via triage
pnpm crux w improve <page-id> --tier=triage --apply --grade

# Triage only (no update)
pnpm crux w improve <page-id> --triage
```
| Tier | Approximate Cost[^3] | Phases |
|---|---|---|
| polish | $2–3 | analyze, improve, validate |
| standard | $5–8 | analyze, research, improve, validate, review |
| deep | $10–15 | analyze, research-deep, improve, validate, review, gap-fill |
| triage | ≈$0.08 + tier cost | news check, then auto-selects above |
## Page Change Tracking
Claude Code sessions log which wiki pages they modify. This creates a per-page change history that feeds two dashboards.
### How It Works

- Each Claude Code session appends an entry to `.claude/sessions/` with a `**Pages:**` field listing the page slugs that were edited.
- At build time, `build-data.mjs` parses session logs and attaches a `changeHistory` array to each page in `database.json`.
- PR numbers are auto-populated at build time via the GitHub API (`github-pr-lookup.mjs`), mapping branch names to PR numbers. Session logs can also include an explicit `**PR:** #123` as an override.
- The data flows to two places:
  - Per-page: A "Change History" section in the PageStatus card shows which sessions touched this page, with clickable PR links.
  - Site-wide: The Page Changes dashboard shows all page edits across all sessions in a sortable table with PR links.
### Session Log Format

The `**Pages:**` field uses page slugs (filenames without `.mdx`), comma-separated:

```
**Pages:** ai-risks, compute-governance, anthropic
```

Omit the field entirely for infrastructure-only sessions that don't edit wiki pages.
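A parser for this field could look like the following sketch. It is hypothetical; the real parsing logic lives in `build-data.mjs` and may differ:

```typescript
// Extract comma-separated page slugs from a "**Pages:** ..." log line.
// Returns an empty array for lines that are not a Pages field.
function parsePagesField(line: string): string[] {
  const match = line.match(/^\*\*Pages:\*\*\s*(.+)$/);
  if (!match) return [];
  return match[1]
    .split(",")
    .map((slug) => slug.trim())
    .filter((slug) => slug.length > 0);
}

parsePagesField("**Pages:** ai-risks, compute-governance, anthropic");
// → ["ai-risks", "compute-governance", "anthropic"]
```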
## Content Grading

Uses the Claude Sonnet API to automatically grade pages with importance, quality, and AI-generated summaries.[^3]
### Commands

```sh
# Preview what would be graded (no API calls)
pnpm crux w grade --dry-run

# Grade a specific page
pnpm crux w grade --page scheming

# Grade pages and apply to frontmatter
pnpm crux w grade --limit 10 --apply

# Grade a category with parallel processing
pnpm crux w grade --category responses --parallel 3

# Skip already-graded pages
pnpm crux w grade --skip-graded --limit 50
```
### Options

| Option | Description |
|---|---|
| `--page ID` | Grade a single page |
| `--dry-run` | Preview without API calls |
| `--limit N` | Only process N pages |
| `--parallel N` | Process N pages concurrently (default: 1) |
| `--category X` | Only process pages in category |
| `--skip-graded` | Skip pages with existing importance |
| `--apply` | Write grades to frontmatter (caution) |
| `--output FILE` | Write results to JSON file |
### Grading Criteria

**Importance (0–100):**
- 90–100: Essential for prioritization (core interventions, key risk mechanisms)
- 70–89: High value (concrete responses, major risk categories)
- 50–69: Useful context (supporting analysis, secondary risks)
- 30–49: Reference material (historical, profiles, niche)
- 0–29: Peripheral (internal docs, stubs)
**Quality (0–100):**

- 80–100: Comprehensive (2+ tables, 1+ diagram, 5+ citations, quantified claims) — maps to Q5[^1]
- 60–79: Good (1+ table, 3+ citations, mostly prose)
- 40–59: Adequate (structure but lacks tables/citations)
- 20–39: Draft (poorly structured, heavy bullets, no evidence)
- 0–19: Stub (minimal content)
### Cost Estimate

Roughly $0.02 per page at Sonnet 4.5 rates.[^3] Grading the full wiki (about 329 pages as of early 2026) costs roughly $6–7, though this figure changes as pages are added.
## Validation Suite

All validators are accessible via the unified `crux` CLI:

```sh
pnpm crux w validate         # Run all validators
pnpm crux w validate --help  # List all validators
```
### Individual Validators

| Command | Description |
|---|---|
| `crux validate compile` | MDX[^4] compilation check |
| `crux validate data` | Entity data integrity |
| `crux validate refs` | Internal reference validation |
| `crux validate mermaid` | Mermaid[^2] diagram syntax |
| `crux validate sidebar` | Sidebar configuration |
| `crux validate entity-links` | EntityLink component validation |
| `crux validate templates` | Template compliance |
| `crux validate quality` | Content quality metrics |
| `crux validate unified` | Unified rules engine (escaping, formatting) |
### Advanced Usage

```sh
# Run specific validators
pnpm crux w validate compile --quick
pnpm crux w validate unified --rules=dollar-signs,markdown-lists

# Skip specific checks
pnpm crux w validate all --skip=component-refs

# CI mode
pnpm crux w validate gate
```
## Citation & Content Tools

Tools for verifying citations, fetching source content, and scanning wiki pages. Data is stored in the wiki-server PostgreSQL database and accessed via a Hono RPC API.
### Citation Verification

```sh
# Verify all citations on a page (fetches URLs, checks quotes)
pnpm crux citations verify <page-id>

# Run citation audits across multiple pages
pnpm crux citations audit

# View citation archive for a page
pnpm crux citations show <page-id>
```
Citation results are stored in two places:

- YAML archive (`data/citation-archive/<page-id>.yaml`) — per-page verification records, checked into git
- PostgreSQL (`citation_content` table) — full fetched text for quote verification, accessed via wiki-server API
### Content Scanning

```sh
# Scan MDX files for content analysis
pnpm crux scan-content
```
### Source Fetching

When verifying citations, the system fetches source URLs through a multi-layer cache:

1. In-memory LRU cache (500 entries, session-scoped; implemented in `crux/lib/source-cache.ts`)
2. PostgreSQL `citation_content` table (durable, cross-machine)
3. Network fetch via the Firecrawl API[^5] or a built-in fallback

Fetched content is cached in `.cache/sources/` locally and persisted to PostgreSQL for future sessions.
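The lookup order can be sketched as a generic read-through over ordered layers. The `CacheLayer` interface here is invented for illustration; the real implementation is in `crux/lib/source-cache.ts` and the wiki-server API:

```typescript
// A cache layer exposes get/set; layers are ordered fastest-first
// (in-memory LRU, then PostgreSQL citation_content in the real system).
interface CacheLayer {
  get(url: string): string | undefined;
  set(url: string, body: string): void;
}

function fetchThroughLayers(
  url: string,
  layers: CacheLayer[],
  fetchRemote: (url: string) => string,
): string {
  for (let i = 0; i < layers.length; i++) {
    const hit = layers[i].get(url);
    if (hit !== undefined) {
      // Backfill faster layers so the next lookup hits earlier.
      for (let j = 0; j < i; j++) layers[j].set(url, hit);
      return hit;
    }
  }
  // All layers missed: fetch from the network and persist everywhere.
  const body = fetchRemote(url);
  for (const layer of layers) layer.set(url, body);
  return body;
}
```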
## Environment Variables

| Variable | Purpose |
|---|---|
| `ANTHROPIC_API_KEY` | Claude API[^3] for summaries and grading |
| `FIRECRAWL_KEY` | Web page fetching via Firecrawl[^5] (optional, has built-in fallback) |
## Data Layer

### Build Data

The data build step must run before the site build, as `apps/web` reads `database.json` at build time rather than making runtime API calls.

```sh
pnpm build-data  # Regenerate all data files
pnpm dev         # Auto-runs build-data first
pnpm build       # Auto-runs build-data first
```
### Generated Files

After running `build-data`:

- `apps/web/src/data/database.json` — Main entity database
- `apps/web/src/data/entities.json` — Entity definitions
- `apps/web/src/data/backlinks.json` — Cross-references
- `apps/web/src/data/tagIndex.json` — Tag index
- `apps/web/src/data/pathRegistry.json` — URL path mappings
- `apps/web/src/data/pages.json` — Page metadata for scripts
### Other Data Scripts

```sh
pnpm crux tb sync:descriptions  # Sync model descriptions from files
pnpm crux tb extract            # Extract data from pages
pnpm crux tb generate-yaml      # Generate YAML from data
pnpm crux tb cleanup-data       # Clean up data files
```
## Content Management CLI

Unified tool for managing and improving content quality via `pnpm crux w`.

### Commands

```sh
# Improve pages
pnpm crux w improve <page-id>

# Grade pages using Claude API
pnpm crux w grade --page scheming
pnpm crux w grade --limit 5 --apply

# Regrade pages
pnpm crux w regrade --page scheming

# Create new pages
pnpm crux w create "Page Title" --tier=standard
```
### Options

| Option | Description |
|---|---|
| `--dry-run` | Preview without API calls |
| `--limit N` | Process only N pages |
| `--apply` | Apply changes directly to files |
| `--page ID` | Target specific page |
## Resource Linking

### Convert URLs to R Components

```sh
# Find URLs that can be converted to <R> components
pnpm crux w resources map expertise-atrophy  # Specific file
pnpm crux w resources map                    # All files
pnpm crux w resources map --stats            # Statistics only

# Auto-convert markdown links to R components
pnpm crux w resources convert --dry-run  # Preview
pnpm crux w resources convert --apply    # Apply changes
```
### Export Resources

```sh
pnpm crux w resources export  # Export resource data
```
## Content Generation

### Generate New Pages

```sh
# Generate a model page from YAML input
pnpm crux w create "Model Name" --type model --file input.yaml

# Generate a risk page
pnpm crux w create "Risk Name" --type risk --file input.yaml

# Generate a response page
pnpm crux w create "Response Name" --type response --file input.yaml
```
### Batch Summaries

```sh
pnpm crux w generate summaries --batch 50  # Generate summaries for multiple pages
```
## Testing

```sh
pnpm test             # Run all tests
pnpm test:lib         # Test library functions
pnpm test:validators  # Test validator functions
```
## Linting and Formatting

```sh
pnpm lint          # Check for linting issues
pnpm lint:fix      # Fix linting issues
pnpm format        # Format all files
pnpm format:check  # Check formatting without changing files
```
## Temporary Files

Convention: All temporary/intermediate files go in `.claude/temp/` (gitignored).

Scripts that generate intermediate output (like grading results) write here by default. This keeps the project root clean and prevents accidental commits.
## Common Workflows

### Improve a Low-Quality Important Page

1. Find candidates:

   ```sh
   pnpm crux w improve --list --max-qual 3
   ```

2. Run improvement:

   ```sh
   pnpm crux w improve economic-disruption --tier=standard --apply
   ```

3. Validate the result:

   ```sh
   pnpm crux w validate compile
   pnpm crux w validate gate
   ```
### Grade All New Pages

1. Preview:

   ```sh
   pnpm crux w grade --skip-graded --dry-run
   ```

2. Grade and apply:

   ```sh
   pnpm crux w grade --skip-graded --apply --parallel 3
   ```

3. Review results:

   ```sh
   cat .claude/temp/grades-output.json
   ```
### Check Content Quality Before PR

```sh
pnpm crux w validate gate --fix
```
### Update After Editing `entities.yaml`

```sh
pnpm build-data
pnpm crux w validate data
```
## Footnotes

[^1]: Q5 quality standards are defined internally within this wiki's content pipeline documentation. Q5 maps to the 80–100 range on the grading scale and requires: Quick Assessment Table (5+ rows), 2+ substantive tables, 1+ Mermaid diagram, 10+ authoritative-source citations, quantified claims, and 800+ words of substantive content. See `crux/lib/page-templates.ts` for the authoritative implementation.

[^2]: Diagram Syntax | Mermaid — Mermaid.js project, 2026. Version 11.13.0 supports 25+ diagram types. All diagram definitions begin with a diagram type declaration; comments use `%%` notation; configuration is via YAML frontmatter or `%%{ }%%` directives.

[^3]: Pricing — Claude API Docs — Anthropic, 2026. Claude Opus 4.5: $5/MTok input, $25/MTok output. Claude Sonnet 4.5: $3/MTok input, $15/MTok output. The Batch API provides a flat 50% discount on all models (Sonnet 4.5 batch: $1.50/$7.50 per MTok). All cost estimates on this page are approximate, reflect typical token usage per page, and are subject to change with Anthropic pricing updates.

[^4]: Docs | MDX — MDX project, 2026. Core compilation package `@mdx-js/mdx` with a remark+rehype pipeline. Supports markdown, JSX, JavaScript expressions, and ESM imports/exports in one format.

[^5]: Firecrawl — Web Data API for AI — Firecrawl, 2026. Web scraping and crawling API; 1 credit per scraped page. The free tier provides 500 lifetime credits. The `FIRECRAWL_KEY` environment variable enables Firecrawl-backed fetching; the system falls back to a built-in fetch implementation if the key is absent.