Longterm Wiki

Claude 3.7 Sonnet on HumanEval: 94%

benchmark-result, Verified

Child of HumanEval

Metadata

Source Table: benchmark_results
Source ID: 8sHFVW5tzI
Parent: HumanEval
Children:
Created: Apr 24, 2026, 7:37 PM
Updated: Apr 24, 2026, 7:37 PM
Synced: Apr 24, 2026, 7:37 PM

Record Data

id: 8sHFVW5tzI
benchmarkId: vxX2rorgxU
modelId: Claude 3.7 Sonnet (ai-model)
score: 94
unit: percent
date: 2025-02-24
sourceUrl:
notes: HumanEval pass rate
testedBy: unknown
testedByOrgId:
evaluationDate:
methodologyNotes:

Source Check Verdicts

confirmed, 99% confidence

Last checked: 4/24/2026

Inline sourcing: confirmed

Debug info

Thing ID: 8sHFVW5tzI

Parent Thing ID: vxX2rorgxU