DROP

Reasoning

Discrete Reasoning Over Paragraphs — a reading comprehension benchmark requiring numerical reasoning operations such as addition, counting, and sorting over text passages.

Models Tested

4

Best Score

92.2%

Median Score

89.35%

Scoring: accuracy

Introduced: 2019-03

Maintainer: AI2

Leaderboard (4 models)

#	Model	Developer	Score
🥇	DeepSeek R1	DeepSeek	92.2%
🥈	DeepSeek Models	DeepSeek	91.6%
🥉	Claude 3.5 Sonnet	Anthropic	87.1%
4	GPT-3.5 Turbo	OpenAI	61.4%
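
The summary figures above (best score and median score) can be re-derived from the four leaderboard rows. The following is a minimal sketch, assuming the median is the ordinary statistical median, which for an even number of entries is the mean of the two middle scores. The variable names are illustrative and not part of any official tooling.

```python
from statistics import median

# Scores in percent, copied from the leaderboard rows above.
scores = {
    "DeepSeek R1": 92.2,
    "DeepSeek Models": 91.6,
    "Claude 3.5 Sonnet": 87.1,
    "GPT-3.5 Turbo": 61.4,
}

# Best score: the highest value among the listed models.
best_model = max(scores, key=scores.get)
best_score = scores[best_model]

# Median: with four entries, the mean of the two middle scores
# (87.1 and 91.6).
median_score = median(scores.values())

print(f"Models tested: {len(scores)}")
print(f"Best score: {best_score:.1f}% ({best_model})")
print(f"Median score: {median_score:.2f}%")
```

Under that assumption the sketch reproduces the reported figures: 4 models, a best score of 92.2%, and a median of 89.35%.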