All Source Checks
Automated source checking of wiki data against original sources. Each record is checked against one or more external sources to confirm accuracy.
| Metric | Value | Detail |
|---|---|---|
| Verified Correct | 7,562 | 74% of checked |
| Has Issues | 1,363 | 13% of checked |
| Can't Verify | 1,314 | 13% of checked, incl. 265 dead links |
| Not Yet Checked | 0 | of 10,239 total |
| Contradicted | 141 | Fix now: data may be wrong |
| Outdated | 28 | Source has newer info |
| Accuracy Rate | 98% | confirmed / (confirmed + wrong + outdated) |
| Needs Recheck | 0 | All up to date |
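
The summary percentages can be reproduced from the raw counts. The sketch below is only an arithmetic check, not the dashboard's actual code. It assumes that "wrong" in the accuracy formula corresponds to the Contradicted count (141), and that the "of checked" denominator is the sum of the three checked buckets. The variable and function names are illustrative.

```python
# Counts copied from the summary figures above.
confirmed = 7_562      # Verified Correct
has_issues = 1_363     # Has Issues
cant_verify = 1_314    # Can't Verify
contradicted = 141     # assumed to be the "wrong" term in the formula
outdated = 28          # Outdated

# With 0 records not yet checked, every record falls in one of three buckets.
total_checked = confirmed + has_issues + cant_verify


def pct(part: int, whole: int) -> int:
    """Return part/whole as a percentage rounded to the nearest integer."""
    return round(100 * part / whole)


# Accuracy Rate = confirmed / (confirmed + wrong + outdated)
accuracy = confirmed / (confirmed + contradicted + outdated)

print(total_checked)                    # total records checked
print(pct(confirmed, total_checked))    # share verified correct
print(pct(has_issues, total_checked))   # share with issues
print(pct(cant_verify, total_checked))  # share that cannot be verified
print(round(100 * accuracy))            # accuracy rate, in percent
```

Under these assumptions the three buckets sum to the stated total of 10,239, and the rounded shares and accuracy rate match the figures shown.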
| Type | Entity | Claim | Verdict | Confidence | Sources | Last Checked | |
|---|---|---|---|---|---|---|---|
| Policy Stakeholder | - | mGU0aT9Q2L | confirmed | 95% | 1 | Apr 29, 2026 | |
| Funding Round | - | LS8icm7VoX | confirmed | 95% | 1 | Apr 29, 2026 | |
| Funding Round | - | aBhKPMXJYl | confirmed | 95% | 1 | Apr 29, 2026 | |
| Funding Round | - | vC9x3Q6kMh | confirmed | 95% | 1 | Apr 29, 2026 | |
| Division | - | CAIS Action Fund | confirmed | 95% | 1 | Apr 29, 2026 | |
| Division | - | Compute Cluster | confirmed | 95% | 1 | Apr 29, 2026 | |
| Benchmark Result | Claude 3.7 Sonnet | sid_ePVee3jidQ / MMMU: 69.1 | confirmed | 99% | 1 | Apr 24, 2026 | |
| Benchmark Result | Claude 3.7 Sonnet | sid_ePVee3jidQ / LiveCodeBench: 65.4 | confirmed | 99% | 1 | Apr 24, 2026 | |
| Benchmark Result | Claude 3.7 Sonnet | sid_ePVee3jidQ / MMLU-Pro: 78.4 | confirmed | 99% | 1 | Apr 24, 2026 | |
| Benchmark Result | Claude 3.7 Sonnet | sid_ePVee3jidQ / GSM8K: 96.4 | confirmed | 99% | 1 | Apr 24, 2026 | |
| Benchmark Result | Claude 3.7 Sonnet | sid_ePVee3jidQ / HumanEval: 94 | confirmed | 99% | 1 | Apr 24, 2026 | |
| Benchmark Result | Claude 3.5 Sonnet | sid_ISfAiImMYg / SWE-bench Verified: 49 | confirmed | 98% | 1 | Apr 24, 2026 | |
| Benchmark Result | Claude 3.5 Sonnet | sid_ISfAiImMYg / GSM8K: 96.4 | confirmed | 95% | 1 | Apr 24, 2026 | |
| Benchmark Result | Mistral | sid_v1e1ZwDwoA / HellaSwag: 84 | confirmed | 95% | 1 | Apr 24, 2026 | |
| Benchmark Result | Mistral | sid_v1e1ZwDwoA / GSM8K: 40.3 | confirmed | 98% | 1 | Apr 24, 2026 | |
| Benchmark Result | Mistral | sid_v1e1ZwDwoA / HumanEval: 30.5 | confirmed | 98% | 1 | Apr 24, 2026 | |
| Benchmark Result | Mistral | sid_v1e1ZwDwoA / MMLU: 60.1 | confirmed | 98% | 1 | Apr 24, 2026 | |
| Benchmark Result | Grok | sid_nnv09Wl5OQ / LiveCodeBench: 79.4 | confirmed | 95% | 1 | Apr 24, 2026 | |
| Benchmark Result | Grok | sid_nnv09Wl5OQ / Chatbot Arena Elo: 1402 | confirmed | 95% | 1 | Apr 24, 2026 | |
| Benchmark Result | Grok | sid_nnv09Wl5OQ / GSM8K: 89.3 | confirmed | 95% | 1 | Apr 24, 2026 | |
| Benchmark Result | Grok | sid_nnv09Wl5OQ / MMLU-Pro: 79.9 | confirmed | 95% | 1 | Apr 24, 2026 | |
| Benchmark Result | Grok | sid_nnv09Wl5OQ / HumanEval: 86.5 | confirmed | 95% | 1 | Apr 24, 2026 | |
| Benchmark Result | GPT-4.1 mini | sid_nywmt9QdsA / MMLU: 80.1 | confirmed | 98% | 1 | Apr 24, 2026 | |
| Benchmark Result | GPT | sid_Gqv7h9oEwA / HellaSwag: 95 | confirmed | 95% | 1 | Apr 24, 2026 | |
| Benchmark Result | GPT | sid_Gqv7h9oEwA / GSM8K: 92 | confirmed | 95% | 1 | Apr 24, 2026 | |
| Benchmark Result | GPT | sid_Gqv7h9oEwA / MATH: 76.6 | confirmed | 95% | 1 | Apr 24, 2026 | |
| Benchmark Result | GPT | sid_Gqv7h9oEwA / MGSM: 90.5 | confirmed | 95% | 1 | Apr 24, 2026 | |
| Benchmark Result | GPT | sid_Gqv7h9oEwA / HumanEval: 90.2 | confirmed | 99% | 1 | Apr 24, 2026 | |
| Benchmark Result | GPT | sid_Gqv7h9oEwA / MMLU: 88.7 | confirmed | 98% | 1 | Apr 24, 2026 | |
| Benchmark Result | Gemini | sid_PaKhQQNPkg / MMLU: 92.4 | confirmed | 95% | 1 | Apr 24, 2026 | |
| Benchmark Result | Gemini | sid_PaKhQQNPkg / HumanEval: 89.7 | confirmed | 95% | 1 | Apr 24, 2026 | |
| Benchmark Result | Gemini | sid_PaKhQQNPkg / MATH: 78.3 | confirmed | 95% | 1 | Apr 24, 2026 | |
| Benchmark Result | Gemini | sid_PaKhQQNPkg / Humanity's Last Exam: 44.4 | confirmed | 99% | 1 | Apr 24, 2026 | |
| Benchmark Result | Gemini | sid_PaKhQQNPkg / SWE-bench Verified: 80.6 | confirmed | 98% | 1 | Apr 24, 2026 | |
| Benchmark Result | Gemini | sid_PaKhQQNPkg / ARC-AGI-2: 77.1 | confirmed | 98% | 1 | Apr 24, 2026 | |
| Benchmark Result | Gemini | sid_PaKhQQNPkg / MMLU-Pro: 90.99 | confirmed | 98% | 1 | Apr 24, 2026 | |
| Benchmark Result | Gemini | sid_PaKhQQNPkg / GPQA Diamond: 94.3 | confirmed | 98% | 1 | Apr 24, 2026 | |
| Benchmark Result | DeepSeek Models | sid_svlbcrT5oQ / GPQA Diamond: 59.1 | confirmed | 99% | 1 | Apr 24, 2026 | |
| Benchmark Result | DeepSeek Models | sid_svlbcrT5oQ / DROP: 91.6 | confirmed | 99% | 1 | Apr 24, 2026 | |
| Benchmark Result | DeepSeek Models | sid_svlbcrT5oQ / MMLU-Pro: 75.9 | confirmed | 95% | 1 | Apr 24, 2026 | |
| Benchmark Result | DeepSeek Models | sid_svlbcrT5oQ / BBH: 87.5 | confirmed | 99% | 1 | Apr 24, 2026 | |
| Benchmark Result | DeepSeek Models | sid_svlbcrT5oQ / HumanEval: 65.2 | confirmed | 99% | 1 | Apr 24, 2026 | |
| Benchmark Result | DeepSeek Models | sid_svlbcrT5oQ / GSM8K: 89.3 | confirmed | 99% | 1 | Apr 24, 2026 | |
| Benchmark Result | DeepSeek Models | sid_svlbcrT5oQ / MMLU: 88.5 | confirmed | 95% | 1 | Apr 24, 2026 | |
| Benchmark Result | DeepSeek Models | sid_svlbcrT5oQ / MATH: 61.6 | confirmed | 99% | 1 | Apr 24, 2026 | |
| Benchmark Result | Claude Opus 4.1 | sid_dHgSM46fMw / GPQA Diamond: 80.9 | confirmed | 95% | 1 | Apr 24, 2026 | |
| Benchmark Result | Claude Opus 4.1 | sid_dHgSM46fMw / SWE-bench Verified: 74.5 | confirmed | 99% | 1 | Apr 24, 2026 | |
| Benchmark Result | Claude Haiku 4.5 | sid_y87VxEBBIA / SWE-bench Verified: 73.3 | confirmed | 98% | 1 | Apr 24, 2026 | |
| Citation | - | xAI - Footnote 12 | confirmed | 100% | 1 | Apr 3, 2026 | |
| Citation | - | Will MacAskill - Footnote 18 | confirmed | 100% | 1 | Apr 3, 2026 | |
Data from the source_check_verdicts table.
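
For readers who want to work with the rows offline, here is a minimal sketch that parses the pipe-delimited layout of the table above into records and tallies them by type. The schema of source_check_verdicts is not shown here, so the field names (`type`, `entity`, `claim`, and so on) are hypothetical and simply mirror the column headers.

```python
from collections import Counter

# Three rows copied from the table above, in the same pipe-delimited layout.
ROWS = """\
| Funding Round | - | LS8icm7VoX | confirmed | 95% | 1 | Apr 29, 2026 | |
| Benchmark Result | Mistral | sid_v1e1ZwDwoA / MMLU: 60.1 | confirmed | 98% | 1 | Apr 24, 2026 | |
| Citation | - | xAI - Footnote 12 | confirmed | 100% | 1 | Apr 3, 2026 | |
"""

# Hypothetical field names, taken from the column headers.
FIELDS = ("type", "entity", "claim", "verdict", "confidence", "sources", "last_checked")


def parse_row(line: str) -> dict:
    """Split one table row into a record, ignoring the empty trailing column."""
    cells = [cell.strip() for cell in line.strip().strip("|").split("|")]
    record = dict(zip(FIELDS, cells[: len(FIELDS)]))
    record["confidence"] = int(record["confidence"].rstrip("%"))  # "95%" -> 95
    record["sources"] = int(record["sources"])
    return record


records = [parse_row(line) for line in ROWS.splitlines()]
by_type = Counter(record["type"] for record in records)
lowest_confidence = min(record["confidence"] for record in records)

print(by_type)
print(lowest_confidence)
```

The same function handles rows with or without the empty trailing column, since only the first seven cells are read.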