Scorecard: ailabwatch 2025-09-01 scored Google DeepMind on Planning = Weak
Verdict: confirmed (95%)
1 check · 5/18/2026 · 1 → confirmed
Our claim
entire record
- Snapshot: ailabwatch-2025-09
- Entity: Google DeepMind
- Dimension Slug: planning
- Dimension Label: Planning
- Score Numeric: 26
- Score Letter: Weak
- Score Raw: 26%
Source evidence
1 src · 1 check · confirmed · 95% · Haiku 4.5 · 5/18/2026
Note: The source is the AI Lab Watch scorecard itself, dated as of September 2025 (with an update note stating 'as of September 15'). The scorecard explicitly shows a Planning row with DeepMind scoring 26%. The claim states scoreRaw: 26% and scoreNumeric: 26, which matches the source exactly. The scoreLetter 'Weak' is a reasonable interpretation of a 26% score on a safety dimension. The publisher (ailabwatch), entity (Google DeepMind/DeepMind), dimension (Planning), and numeric score all align with the source data. The publishedAt date of 2025-09-01 is consistent with the source's September 2025 timeframe.
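The internal-consistency part of the note (scoreRaw agreeing with scoreNumeric) can be illustrated with a minimal Python sketch. The field names scoreRaw, scoreNumeric, and scoreLetter come from the note; the function name raw_matches_numeric and the dictionary layout are hypothetical, and this is not the checker that produced the verdict. The mapping from 26% to 'Weak' is left out on purpose, because the record does not state the scorecard's grade thresholds.

```python
def raw_matches_numeric(score_raw: str, score_numeric: float) -> bool:
    """Return True if a percentage string such as '26%' equals the numeric score.

    Hypothetical helper for illustration only.
    """
    text = score_raw.strip()
    if not text.endswith("%"):
        return False
    try:
        value = float(text[:-1])
    except ValueError:
        # The part before '%' is not a number.
        return False
    return value == float(score_numeric)


# Values taken from the claim record above; the dict layout is an assumption.
record = {"scoreRaw": "26%", "scoreNumeric": 26, "scoreLetter": "Weak"}
print(raw_matches_numeric(record["scoreRaw"], record["scoreNumeric"]))
```

A check like this only confirms that the two score fields agree with each other; comparing them against the published scorecard still requires reading the source.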
Case № ailabwatch-2025-09|sid_A4XoubikkQ|planning · Filed 5/18/2026 · Confidence 95%