GitHub - openai/prm800k: 800,000 step-level correctness labels on LLM solutions to MATH problems · GitHub
web
Credibility Rating
Good quality. Reputable source with community review or editorial standards, but less rigorous than peer-reviewed venues.
Rating inherited from publication venue: GitHub
This dataset is the empirical foundation for OpenAI's work on process supervision vs. outcome supervision; closely related to debates about scalable oversight, recursive reward modeling, and catching reasoning errors in capable AI systems.
Metadata
Summary
PRM800K is a dataset released by OpenAI containing 800,000 step-level human correctness labels on large language model solutions to MATH competition problems. It supports training and evaluating process reward models (PRMs), which provide feedback on individual reasoning steps rather than final answers. This dataset underpins research into process supervision as a method for improving LLM reasoning reliability and safety.
Key Points
- Contains 800,000 step-level correctness labels on LLM-generated solutions to MATH benchmark problems, enabling granular supervision of reasoning chains.
- Supports training Process Reward Models (PRMs) that score each step of a solution, rather than only the final outcome.
- Process supervision is proposed as a safer and more effective alternative to outcome supervision for catching subtle reasoning errors.
- Released alongside OpenAI's research showing PRMs outperform outcome reward models on competitive math problem-solving benchmarks.
- Relevant to AI safety as step-level feedback can help detect and reduce deceptive or flawed reasoning in LLMs.
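The contrast between outcome-level and step-level scoring can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the function names, the per-step probabilities, and the choice to aggregate step scores by taking their product are hypothetical and are not taken from the repository's code or data format.

```python
from math import prod
from typing import Dict, Optional, Sequence


def outcome_score(final_answer: str, reference: str) -> float:
    """Outcome supervision: one signal per solution, based only on the final answer."""
    return 1.0 if final_answer.strip() == reference.strip() else 0.0


def process_score(step_probs: Sequence[float]) -> float:
    """Process supervision (assumed aggregation): multiply per-step
    correctness probabilities to estimate that every step is correct."""
    return prod(step_probs)


def first_suspect_step(step_probs: Sequence[float], threshold: float = 0.5) -> Optional[int]:
    """Return the index of the first step scored below the threshold, or None."""
    for index, prob in enumerate(step_probs):
        if prob < threshold:
            return index
    return None


def best_of_n(candidates: Dict[str, Sequence[float]]) -> str:
    """Pick the candidate solution whose steps receive the highest process score."""
    return max(candidates, key=lambda name: process_score(candidates[name]))


# Hypothetical per-step probabilities, as a step-level scorer might emit them.
candidates = {
    "solution_a": [0.9, 0.9],
    "solution_b": [0.99, 0.1],  # strong first step, then a likely error
}
chosen = best_of_n(candidates)
suspect = first_suspect_step(candidates["solution_b"])
```

Unlike `outcome_score`, the step-level functions can point to where a solution likely goes wrong, which is the kind of granular feedback the key points above describe.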
Cited by 2 pages
| Page | Type | Quality |
|---|---|---|
| Process Supervision | Approach | 65.0 |
| Scalable Oversight | Research Area | 68.0 |
Cached Content Preview
[openai](https://github.com/openai)/ **[prm800k](https://github.com/openai/prm800k)** Public
## Folders and files
Latest commit: [Huntrr](https://github.com/openai/prm800k/commits?author=Huntrr), [Add arXiv link](https://github.com/openai/prm800k/commit/7ecc794703b2877f63226f2477a49b34f9b25163) ([7ecc794](https://github.com/openai/prm800k/commit/7ecc794703b2877f63226f2477a49b34f9b25163)), Jun 1, 2023. History: [2 Commits](https://github.com/openai/prm800k/commits/main/)

| Name | Last commit message | Last commit date |
| --- | --- | --- |
| [prm800k](https://github.com/openai/prm800k/tree/main/prm800k "prm800k") | [prm800k](https://github.com/openai/prm800k/commit/00811d6de065642a6967b9017d4cee59550c0ef4 "prm800k") | May 30, 2023 |
| [.gitattributes](https://github.com/openai/prm800k/blob/main/.gitattributes ".gitattributes") | [prm800k](https://github.com/openai/prm800k/commit/00811d6de065642a6967b9017d4cee59550c0ef4 "prm800k") | May 30, 2023 |
| [LICENSE](https://github.com/openai/prm800k/blob/main/LICENSE "LICENSE") | [prm800k](https://github.com/openai/prm800k/commit/00811d6de065642a6967b9017d4cee59550c0ef4 "prm800k") | May 30, 2023 |
| [README.md](https://github.com/openai/prm800k/blob/main/README.md "README.md") | [Add arXiv link](https://github.com/openai/prm800k/commit/7ecc794703b2877f63226f2477a49b34f9b25163 "Add arXiv li
... (truncated, 17 KB total)eccb4758de07641b | Stable ID: YTZhN2Q5Yj