Longterm Wiki

Claude Opus 4.5 on GSM8K: 95

benchmark-result (Verified)

Child of GSM8K

Metadata

Source Table: benchmark_results
Source ID: 1PqZPHUnMt
Source URL: automatio.ai/models/claude-opus-4-5
Parent: GSM8K
Children:
Created: Apr 24, 2026, 7:48 PM
Updated: Apr 24, 2026, 7:48 PM
Synced: Apr 24, 2026, 7:48 PM

Record Data

id: 1PqZPHUnMt
benchmarkId: fjjBrOI3p2
modelId: Claude Opus 4.5 (ai-model)
score: 95
unit: percent
date: 2025-11-24
sourceUrl: automatio.ai/models/claude-opus-4-5
notes: GSM8K - Grade school math reasoning
testedBy: unknown
testedByOrgId:
evaluationDate:
methodologyNotes:

Source Check Verdicts

confirmed (99% confidence)

Last checked: 4/29/2026

1 → confirmed

Debug info

Thing ID: 1PqZPHUnMt

Source Table: benchmark_results

Source ID: 1PqZPHUnMt

Parent Thing ID: fjjBrOI3p2