Longterm Wiki

All Source Checks

Automated source checking of wiki data against original sources. Each record is checked against one or more external sources to confirm accuracy.

Verified Correct

7,562

74% of checked

Has Issues

1,363

13% of checked

Can't Verify

1,314

13% of checked, incl. 265 dead links

Not Yet Checked

0

of 10,239 total

Contradicted

141

Fix now — data may be wrong

Outdated

28

Source has newer info

Accuracy Rate

98%

confirmed / (confirmed + wrong + outdated)
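The accuracy formula above can be reproduced from the counts shown on this page. The sketch below is illustrative only: the function name `accuracy_rate` is hypothetical, and it assumes that "wrong" corresponds to the Contradicted count (141) rather than the broader Has Issues count, since that reading yields the displayed 98%.

```python
def accuracy_rate(confirmed: int, wrong: int, outdated: int) -> float:
    """Return confirmed / (confirmed + wrong + outdated).

    Records that could not be verified are excluded from the denominator.
    """
    decided = confirmed + wrong + outdated
    if decided == 0:
        raise ValueError("no decided verdicts to compute a rate from")
    return confirmed / decided


# Counts taken from the summary cards on this page.
# Assumption: "wrong" is the Contradicted count.
rate = accuracy_rate(confirmed=7562, wrong=141, outdated=28)
print(f"{rate:.1%}")  # 97.8%, which rounds to the displayed 98%
```

Because unverifiable records (including dead links) are left out of the denominator, this rate can sit well above the 74% "Verified Correct" share of all checked records.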

Needs Recheck

0

All up to date

9,414 results
Policy Stakeholder · confirmed · mGU0aT9Q2L · Apr 29, 2026
Funding Round · confirmed · LS8icm7VoX · Apr 29, 2026
Funding Round · confirmed · aBhKPMXJYl · Apr 29, 2026
Funding Round · confirmed · vC9x3Q6kMh · Apr 29, 2026
Division · confirmed · CAIS Action Fund · Apr 29, 2026
Division · confirmed · Compute Cluster · Apr 29, 2026
Benchmark Result · confirmed · sid_ePVee3jidQ / MMMU: 69.1 · Claude 3.7 Sonnet · Apr 24, 2026
Benchmark Result · confirmed · sid_ePVee3jidQ / LiveCodeBench: 65.4 · Claude 3.7 Sonnet · Apr 24, 2026
Benchmark Result · confirmed · sid_ePVee3jidQ / MMLU-Pro: 78.4 · Claude 3.7 Sonnet · Apr 24, 2026
Benchmark Result · confirmed · sid_ePVee3jidQ / GSM8K: 96.4 · Claude 3.7 Sonnet · Apr 24, 2026
Benchmark Result · confirmed · sid_ePVee3jidQ / HumanEval: 94 · Claude 3.7 Sonnet · Apr 24, 2026
Benchmark Result · confirmed · sid_ISfAiImMYg / SWE-bench Verified: 49 · Claude 3.5 Sonnet · Apr 24, 2026
Benchmark Result · confirmed · sid_ISfAiImMYg / GSM8K: 96.4 · Claude 3.5 Sonnet · Apr 24, 2026
Benchmark Result · confirmed · sid_v1e1ZwDwoA / HellaSwag: 84 · Mistral · Apr 24, 2026
Benchmark Result · confirmed · sid_v1e1ZwDwoA / GSM8K: 40.3 · Mistral · Apr 24, 2026
Benchmark Result · confirmed · sid_v1e1ZwDwoA / HumanEval: 30.5 · Mistral · Apr 24, 2026
Benchmark Result · confirmed · sid_v1e1ZwDwoA / MMLU: 60.1 · Mistral · Apr 24, 2026
Benchmark Result · confirmed · sid_nnv09Wl5OQ / LiveCodeBench: 79.4 · Grok · Apr 24, 2026
Benchmark Result · confirmed · sid_nnv09Wl5OQ / Chatbot Arena Elo: 1402 · Grok · Apr 24, 2026
Benchmark Result · confirmed · sid_nnv09Wl5OQ / GSM8K: 89.3 · Grok · Apr 24, 2026
Benchmark Result · confirmed · sid_nnv09Wl5OQ / MMLU-Pro: 79.9 · Grok · Apr 24, 2026
Benchmark Result · confirmed · sid_nnv09Wl5OQ / HumanEval: 86.5 · Grok · Apr 24, 2026
Benchmark Result · confirmed · sid_nywmt9QdsA / MMLU: 80.1 · GPT-4.1 mini · Apr 24, 2026
Benchmark Result · confirmed · sid_Gqv7h9oEwA / HellaSwag: 95 · GPT · Apr 24, 2026
Benchmark Result · confirmed · sid_Gqv7h9oEwA / GSM8K: 92 · GPT · Apr 24, 2026
Benchmark Result · confirmed · sid_Gqv7h9oEwA / MATH: 76.6 · GPT · Apr 24, 2026
Benchmark Result · confirmed · sid_Gqv7h9oEwA / MGSM: 90.5 · GPT · Apr 24, 2026
Benchmark Result · confirmed · sid_Gqv7h9oEwA / HumanEval: 90.2 · GPT · Apr 24, 2026
Benchmark Result · confirmed · sid_Gqv7h9oEwA / MMLU: 88.7 · GPT · Apr 24, 2026
Benchmark Result · confirmed · sid_PaKhQQNPkg / MMLU: 92.4 · Gemini · Apr 24, 2026
Benchmark Result · confirmed · sid_PaKhQQNPkg / HumanEval: 89.7 · Gemini · Apr 24, 2026
Benchmark Result · confirmed · sid_PaKhQQNPkg / MATH: 78.3 · Gemini · Apr 24, 2026
Benchmark Result · confirmed · sid_PaKhQQNPkg / Humanity's Last Exam: 44.4 · Gemini · Apr 24, 2026
Benchmark Result · confirmed · sid_PaKhQQNPkg / SWE-bench Verified: 80.6 · Gemini · Apr 24, 2026
Benchmark Result · confirmed · sid_PaKhQQNPkg / ARC-AGI-2: 77.1 · Gemini · Apr 24, 2026
Benchmark Result · confirmed · sid_PaKhQQNPkg / MMLU-Pro: 90.99 · Gemini · Apr 24, 2026
Benchmark Result · confirmed · sid_PaKhQQNPkg / GPQA Diamond: 94.3 · Gemini · Apr 24, 2026
Benchmark Result · confirmed · sid_svlbcrT5oQ / GPQA Diamond: 59.1 · DeepSeek Models · Apr 24, 2026
Benchmark Result · confirmed · sid_svlbcrT5oQ / DROP: 91.6 · DeepSeek Models · Apr 24, 2026
Benchmark Result · confirmed · sid_svlbcrT5oQ / MMLU-Pro: 75.9 · DeepSeek Models · Apr 24, 2026
Benchmark Result · confirmed · sid_svlbcrT5oQ / BBH: 87.5 · DeepSeek Models · Apr 24, 2026
Benchmark Result · confirmed · sid_svlbcrT5oQ / HumanEval: 65.2 · DeepSeek Models · Apr 24, 2026
Benchmark Result · confirmed · sid_svlbcrT5oQ / GSM8K: 89.3 · DeepSeek Models · Apr 24, 2026
Benchmark Result · confirmed · sid_svlbcrT5oQ / MMLU: 88.5 · DeepSeek Models · Apr 24, 2026
Benchmark Result · confirmed · sid_svlbcrT5oQ / MATH: 61.6 · DeepSeek Models · Apr 24, 2026
Benchmark Result · confirmed · sid_dHgSM46fMw / GPQA Diamond: 80.9 · Claude Opus 4.1 · Apr 24, 2026
Benchmark Result · confirmed · sid_dHgSM46fMw / SWE-bench Verified: 74.5 · Claude Opus 4.1 · Apr 24, 2026
Benchmark Result · confirmed · sid_y87VxEBBIA / SWE-bench Verified: 73.3 · Claude Haiku 4.5 · Apr 24, 2026
Citation · confirmed · xAI - Footnote 12 · Apr 3, 2026
Citation · confirmed · Will MacAskill - Footnote 18 · Apr 3, 2026
Showing 9,251–9,300 of 9,414 (page 186 of 189)

Data from the source_check_verdicts table.