Longterm Wiki
BrowseComp
Agentic
BrowseComp is a benchmark that evaluates AI systems' ability to find hard-to-locate information on the web. Its difficult queries test browsing, search, and information synthesis capabilities.
Models Tested: 1
Best Score: 84
Median Score: 84
Scoring: accuracy
Introduced: 2025-04
Maintainer: OpenAI
Leaderboard (1 model)
# | Model | Developer | Score
🥇 | Claude Opus 4.6 | Anthropic | 84
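The summary figures on this page (Models Tested, Best Score, Median Score) follow directly from the leaderboard rows. A minimal sketch of that arithmetic, assuming the rows are held as simple tuples; the `leaderboard` list and `summarize` helper are illustrative names, not part of any published BrowseComp tooling:

```python
from statistics import median

# Leaderboard rows copied from this page: (model, developer, score).
# The tuple layout is an assumption made for this sketch.
leaderboard = [
    ("Claude Opus 4.6", "Anthropic", 84),
]


def summarize(rows):
    """Derive the page's summary figures from the leaderboard rows."""
    scores = [score for _, _, score in rows]
    return {
        "models_tested": len(scores),
        "best_score": max(scores),
        "median_score": median(scores),
    }


summary = summarize(leaderboard)
print(summary)
```

With a single entry, the best and median scores coincide, which is why both read 84 here.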