DocVQA

Multimodal

Document Visual Question Answering: a benchmark of 50,000 questions on 12,000+ document images that tests a model's ability to understand and extract information from real-world documents.