Longterm Wiki

Scorecard: ailabwatch 2025-09-01 scored Google DeepMind on Planning = Weak

Verdict: confirmed · 95%
1 check · 5/18/2026

1 → confirmed

Our claim

entire record
Snapshot: ailabwatch-2025-09
Entity: Google DeepMind
Dimension Slug: planning
Dimension Label: Planning
Score Numeric: 26
Score Letter: Weak
Score Raw: 26%

Source evidence

1 src · 1 check
confirmed · 95% · Haiku 4.5 · 5/18/2026

Note: The source is the AI Lab Watch scorecard itself, dated as of September 2025 (with an update note stating 'as of September 15'). The scorecard explicitly shows a Planning row with DeepMind scoring 26%. The claim states scoreRaw: 26% and scoreNumeric: 26, which matches the source exactly. The scoreLetter 'Weak' is a reasonable interpretation of a 26% score on a safety dimension. The publisher (ailabwatch), entity (Google DeepMind/DeepMind), dimension (Planning), and numeric score all align with the source data. The publishedAt date of 2025-09-01 is consistent with the source's September 2025 timeframe.

Case № ailabwatch-2025-09|sid_A4XoubikkQ|planning · Filed 5/18/2026 · Confidence 95%