DocVQA
Multimodal
Document Visual Question Answering: a benchmark of 50,000 questions on 12,000+ document images that tests a model's ability to understand and extract information from real-world documents.
Models Tested: 2
Best Score: 94.4%
Median Score: 94.4%
Scoring: accuracy
Introduced: 2020-07
Leaderboard (2 models)
| # | Model | Developer | Score |
|---|---|---|---|
| 🥇 | Llama 4 Scout | Meta AI (FAIR) | 94.4% |
| 🥈 | Llama 4 Maverick | Meta AI (FAIR) | 94.4% |
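
The summary figures (models tested, best score, median score) follow directly from the leaderboard rows. A minimal sketch of that arithmetic, assuming the scores are plain accuracy percentages as listed in the table; the `leaderboard` list and the `summarize` helper are hypothetical names used only for illustration, not part of any benchmark tooling:

```python
from statistics import median

# Scores copied from the leaderboard table (accuracy, in percent).
leaderboard = [
    ("Llama 4 Scout", 94.4),
    ("Llama 4 Maverick", 94.4),
]


def summarize(rows):
    """Return (models_tested, best_score, median_score) for (name, score) rows."""
    scores = [score for _, score in rows]
    return len(scores), max(scores), median(scores)


models_tested, best_score, median_score = summarize(leaderboard)
print(f"Models Tested: {models_tested}")
print(f"Best Score: {best_score:.1f}%")
print(f"Median Score: {median_score:.1f}%")
```

With only two entries that share the same score, the best and the median coincide, which is why both summary values read 94.4%.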