Longterm Wiki

LiveCodeBench

Coding
A coding benchmark designed to avoid training-data contamination by using newly published competitive programming problems from LeetCode, AtCoder, and Codeforces. New problems are added continuously to limit data leakage.
Models Tested: 12
Best Score: 79.4
Median Score: 65.65
Scoring: pass_at_1
Introduced: 2024-06
Maintainer: LiveCodeBench Team
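
For readers unfamiliar with the pass_at_1 metric, here is a minimal sketch of how a pass@k score is commonly computed, using the widely used unbiased estimator 1 - C(n-c, k) / C(n, k), which reduces to c/n for k = 1. This is an illustration only: the function names and the per-problem input format are hypothetical, and the page does not state that the benchmark's own harness computes scores exactly this way.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Estimate pass@k for one problem from n samples, c of which pass all tests.

    Uses the common unbiased estimator 1 - C(n - c, k) / C(n, k).
    (Hypothetical helper, not taken from the benchmark's code.)
    """
    if n - c < k:
        # Fewer than k failing samples: every size-k draw contains a pass.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)


def benchmark_pass_at_1(results: list[tuple[int, int]]) -> float:
    """Average pass@1 over problems, as a percentage.

    `results` is an assumed format: one (n_samples, n_correct) pair per problem.
    """
    return 100.0 * sum(pass_at_k(n, c, 1) for n, c in results) / len(results)
```

With a single sample per problem, pass@1 is simply the percentage of problems solved on the first attempt.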

Leaderboard (12 models)

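| # | Model | Developer | Score (%) |
|---|-------|-----------|-----------|
| 🥇 | Grok | xAI | 79.4 |
| 🥈 | Grok-3 | xAI | 79.4 |
| 🥉 | o3 | OpenAI | 71.7 |
| 4 | Claude Opus 4.5 | Anthropic | 70.3 |
| 5 | o4-mini | OpenAI | 67.8 |
| 6 | DeepSeek R1 | DeepSeek | 65.9 |
| 7 | Claude 3.7 Sonnet | Anthropic | 65.4 |
| 8 | Gemini 2.5 Pro | Google DeepMind | 63.4 |
| 9 | o3-mini | OpenAI | 57.6 |
| 10 | Llama 4 Maverick | Meta AI (FAIR) | 43.4 |
| 11 | DeepSeek V3 | DeepSeek | 40.5 |
| 12 | Llama 4 Scout | Meta AI (FAIR) | 32.8 |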
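
The summary figures at the top of the page (models tested, best score, median score) can be rechecked from the leaderboard scores. The sketch below copies the twelve listed scores by hand; the variable names are illustrative and not part of any published tooling.

```python
from statistics import median

# Scores transcribed from the leaderboard, in rank order.
scores = [79.4, 79.4, 71.7, 70.3, 67.8, 65.9,
          65.4, 63.4, 57.6, 43.4, 40.5, 32.8]

models_tested = len(scores)
best_score = max(scores)
# With an even number of entries, the median is the mean of the two
# middle values (here the 6th and 7th ranked scores).
median_score = round(median(scores), 2)

print(models_tested, best_score, median_score)
```

Because there are twelve entries, the median falls between the scores ranked 6th and 7th, which is why it does not match any single model's score.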