Scorecard: ailabwatch 2025-09-01 scored Google DeepMind on Scheming risk prevention = Very Weak
Verdict: confirmed (95%)
1 check · 5/18/2026 · 1 → confirmed
Our claim
- Snapshot: ailabwatch-2025-09
- Entity: Google DeepMind
- Dimension Slug: scheming
- Dimension Label: Scheming risk prevention
- Score Numeric: 8
- Score Letter: Very Weak
- Score Raw: 8%
Source evidence
1 src · 1 check · confirmed · 95% · Haiku 4.5 · 5/18/2026
Note: The source directly confirms all key fields: (1) publisher is 'AI Lab Watch', (2) the date is September 2025 (specifically 'as of September 15'), (3) the entity is 'DeepMind' (shown in the column header), (4) the dimension is 'Scheming risk prevention' (shown in the row header), (5) the scoreRaw is '8%' (shown in the DeepMind column for that row). The scoreLetter 'Very Weak' is a reasonable qualitative interpretation of 8%, consistent with typical grading scales. The scoreNumeric value of 8 matches the raw percentage. All fields are directly supported by the source table.
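The numeric-versus-raw comparison described in the note can be sketched as a small consistency check. This is only an illustration: the dictionary keys and the `raw_matches_numeric` helper are hypothetical names chosen for this example, not part of the record's actual schema or tooling. It does not attempt to validate the 'Very Weak' label, since the grading thresholds are not given here.

```python
# Hypothetical record mirroring the fields listed under "Our claim".
# Key names are assumptions made for this sketch.
record = {
    "snapshot": "ailabwatch-2025-09",
    "entity": "Google DeepMind",
    "dimension_slug": "scheming",
    "dimension_label": "Scheming risk prevention",
    "score_numeric": 8,
    "score_letter": "Very Weak",
    "score_raw": "8%",
}


def raw_matches_numeric(rec):
    """Return True when the raw percentage string equals the numeric score."""
    raw = rec["score_raw"].strip()
    if not raw.endswith("%"):
        return False
    try:
        # Compare as floats so "8%" and 8 (or 8.0) are treated as equal.
        return float(raw[:-1]) == float(rec["score_numeric"])
    except ValueError:
        # Non-numeric raw values (e.g. "n/a%") cannot match.
        return False


print(raw_matches_numeric(record))
```

A check like this only covers the internal agreement between two fields; confirming the values against the source table remains a separate step.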
Case № ailabwatch-2025-09|sid_A4XoubikkQ|scheming · Filed 5/18/2026 · Confidence 95%