Citation

Anthropic - Footnote 17

partial · 85% confidence

1 evidence check

Last checked: 4/3/2026

The source does not mention the specific percentage achieved on SWE-bench Verified (80.9%) or OSWorld (61.4%). The source does not state that Claude Opus 4.5 is the first AI model to exceed 80% on SWE-bench Verified or 60% on Terminal-Bench 2.0. The source does not provide the next-best model's score on OSWorld (7.8%). The source only mentions Terminal Bench, not Terminal-Bench 2.0.

Evidence — 1 source, 1 check

www.anthropic.com/news/claude-opus-4-5 (1 check)

partial · 85% · Haiku 4.5 · 4/3/2026

Found: Claude Opus 4.5, released in November 2025, achieved results on benchmarks for complex enterprise tasks: 80.9% on SWE-bench Verified (the first AI model to exceed 80%), 60%+ on Terminal-Bench 2.0 (the…

