FrontierMath

Math

Research-grade mathematics benchmark from Epoch AI featuring original problems created by professional mathematicians. Designed to remain unsaturated for years.

Models Tested

1

Best Score

4.5

Median Score

4.5

Scoring: accuracy

Introduced: 2024-11

Maintainer: Epoch AI

Leaderboard (1 model)

#	Model	Developer	Score
🥇	GPT-4.1 mini	OpenAI	4.5
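
With a single entry on the leaderboard, the best and median scores are necessarily the same value. A minimal sketch of how the summary figures could be derived from the table rows, assuming they are simply the maximum and the median of the listed scores (the variable names are illustrative and not part of any Epoch AI tooling):

```python
from statistics import median

# Leaderboard rows copied from the table above: (model, developer, score).
leaderboard = [
    ("GPT-4.1 mini", "OpenAI", 4.5),
]

# Assumption: the summary stats are the max and median of the listed scores.
scores = [score for _, _, score in leaderboard]
best_score = max(scores)
median_score = median(scores)

print(f"Models tested: {len(leaderboard)}")
print(f"Best score: {best_score}")
print(f"Median score: {median_score}")
```

As more models are added to the list, the two figures would diverge.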