Longterm Wiki
GitHub - openai/prm800k: 800,000 step-level correctness labels on LLM solutions to MATH problems · GitHub

web

Credibility Rating

3/5
Good(3)

Good quality. Reputable source with community review or editorial standards, but less rigorous than peer-reviewed venues.

Rating inherited from publication venue: GitHub

This dataset is the empirical foundation for OpenAI's work on process supervision vs. outcome supervision; closely related to debates about scalable oversight, recursive reward modeling, and catching reasoning errors in capable AI systems.

Metadata

Importance: 72/100 · dataset

Summary

PRM800K is a dataset released by OpenAI containing 800,000 step-level human correctness labels on large language model solutions to MATH competition problems. It supports training and evaluating process reward models (PRMs), which provide feedback on individual reasoning steps rather than final answers. This dataset underpins research into process supervision as a method for improving LLM reasoning reliability and safety.

Key Points

  • Contains 800,000 step-level correctness labels on LLM-generated solutions to MATH benchmark problems, enabling granular supervision of reasoning chains.
  • Supports training Process Reward Models (PRMs) that score each step of a solution, rather than only the final outcome.
  • Process supervision is proposed as a safer and more effective alternative to outcome supervision for catching subtle reasoning errors.
  • Released alongside OpenAI's research showing PRMs outperform outcome reward models on competitive math problem-solving benchmarks.
  • Relevant to AI safety as step-level feedback can help detect and reduce deceptive or flawed reasoning in LLMs.
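The contrast between step-level and outcome-only feedback in the points above can be sketched in a few lines of Python. This is an illustrative sketch, not code from the PRM800K repository: the `Solution` class, its field names, the per-step probabilities, and the 0.5 threshold are all hypothetical, and multiplying per-step probabilities is only one possible way to aggregate step scores into a solution score, assumed here for simplicity.

```python
from dataclasses import dataclass
from math import prod
from typing import List, Optional


@dataclass
class Solution:
    """Hypothetical container for one candidate solution (not the dataset schema)."""
    steps: List[str]
    step_probs: List[float]      # assumed PRM output: P(step is correct), one per step
    final_answer_correct: bool   # the only signal an outcome-level check would see


def process_score(solution: Solution) -> float:
    """Aggregate step scores as the product of per-step probabilities (an assumption)."""
    return prod(solution.step_probs)


def first_suspect_step(solution: Solution, threshold: float = 0.5) -> Optional[int]:
    """Return the index of the first step scored below the threshold, or None."""
    for index, prob in enumerate(solution.step_probs):
        if prob < threshold:
            return index
    return None


def best_of_n(candidates: List[Solution]) -> Solution:
    """Pick the candidate with the highest aggregated step-level score."""
    return max(candidates, key=process_score)


# Two made-up candidates that both end on the right answer.
# The second reaches it through a flawed intermediate step.
candidates = [
    Solution(["2x + 3 = 11", "2x = 8", "x = 4"], [0.9, 0.9, 0.9], True),
    Solution(["2x + 3 = 11", "2x = 14", "x = 4"], [0.9, 0.2, 0.8], True),
]

for candidate in candidates:
    print(candidate.final_answer_correct,
          round(process_score(candidate), 3),
          first_suspect_step(candidate))

chosen = best_of_n(candidates)
```

Both candidates look identical to an outcome-only check, since both final answers are correct. The step-level scores separate them and point at the specific step that deserves scrutiny.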

Cited by 2 pages

| Page | Type | Quality |
| --- | --- | --- |
| Process Supervision | Approach | 65.0 |
| Scalable Oversight | Research Area | 68.0 |

Cached Content Preview

HTTP 200 · Fetched Mar 20, 2026 · 17 KB

[openai](https://github.com/openai)/ **[prm800k](https://github.com/openai/prm800k)** Public

- Forks: 125
- Stars: 2.1k


main

[**1** Branch](https://github.com/openai/prm800k/branches) [**0** Tags](https://github.com/openai/prm800k/tags)


## Folders and files

Latest commit: [Add arXiv link](https://github.com/openai/prm800k/commit/7ecc794703b2877f63226f2477a49b34f9b25163) by [Huntrr](https://github.com/openai/prm800k/commits?author=Huntrr) ([7ecc794](https://github.com/openai/prm800k/commit/7ecc794703b2877f63226f2477a49b34f9b25163)), Jun 1, 2023. History: [2 Commits](https://github.com/openai/prm800k/commits/main/)

| Name | Last commit message | Last commit date |
| --- | --- | --- |
| [prm800k](https://github.com/openai/prm800k/tree/main/prm800k) | [prm800k](https://github.com/openai/prm800k/commit/00811d6de065642a6967b9017d4cee59550c0ef4) | May 30, 2023 |
| [.gitattributes](https://github.com/openai/prm800k/blob/main/.gitattributes) | [prm800k](https://github.com/openai/prm800k/commit/00811d6de065642a6967b9017d4cee59550c0ef4) | May 30, 2023 |
| [LICENSE](https://github.com/openai/prm800k/blob/main/LICENSE) | [prm800k](https://github.com/openai/prm800k/commit/00811d6de065642a6967b9017d4cee59550c0ef4) | May 30, 2023 |
| [README.md](https://github.com/openai/prm800k/blob/main/README.md) | [Add arXiv link](https://github.com/openai/prm800k/commit/7ecc794703b2877f63226f2477a49b34f9b25163 "Add arXiv li

... (truncated, 17 KB total)
Resource ID: eccb4758de07641b | Stable ID: YTZhN2Q5Yj