Model weight leaderboards
web · huggingface.co · huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard
This archived leaderboard was a widely-cited reference for comparing open-source LLM capabilities; useful context for understanding how the field tracked model progress and the limitations of static benchmark-based evaluation.
Metadata
Importance: 52/100 · tool page · tool
Summary
The Open LLM Leaderboard is a Hugging Face-hosted benchmarking platform that compares open-source large language models across standardized evaluations in a transparent and reproducible manner. It lets researchers and practitioners filter, search, and rank models by performance metrics, providing a community reference for tracking progress in AI capabilities. The leaderboard has since been archived, reflecting the rapid pace of LLM development.
Key Points
- Provides standardized, reproducible benchmarking of open-source LLMs across common evaluation tasks.
- Supports advanced search, including regex and multi-term filtering, for granular model comparison.
- Serves as a community reference point for tracking the frontier of open-source AI capabilities over time.
- Now archived, indicating the original benchmark suite may have become saturated or superseded by newer evaluations.
- Relevant to AI safety discussions around capability elicitation, evaluation methodology, and tracking model progress.
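The cached page describes the search box as supporting "strict search and regex" with semicolons separating multiple terms. The sketch below illustrates one plausible reading of that behavior. It is not the leaderboard's actual implementation: the function names, the case-insensitive matching, the "any term matches" rule, and the model names are all assumptions made for illustration.

```python
import re


def matches_query(model_name: str, query: str) -> bool:
    """Return True if any semicolon-separated term matches the model name.

    Assumption: terms are combined with OR and matched case-insensitively.
    """
    terms = [t.strip() for t in query.split(";") if t.strip()]
    if not terms:
        return True  # an empty query filters nothing out
    for term in terms:
        try:
            pattern = re.compile(term, re.IGNORECASE)
        except re.error:
            # Fall back to a literal match when the term is not valid regex.
            pattern = re.compile(re.escape(term), re.IGNORECASE)
        if pattern.search(model_name):
            return True
    return False


def filter_models(model_names, query):
    """Keep only the model names that satisfy the query."""
    return [name for name in model_names if matches_query(name, query)]


# Hypothetical model names, not taken from the leaderboard.
models = ["org-a/alpha-7b", "org-b/beta-13b-chat", "org-c/gamma-70b"]
print(filter_models(models, "alpha; ^org-c/"))
# → ['org-a/alpha-7b', 'org-c/gamma-70b']
```

Here the first term matches as a plain substring and the second as an anchored regex, which is the kind of mixed query the semicolon syntax appears designed for.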
Cited by 2 pages
| Page | Type | Quality |
|---|---|---|
| AI Proliferation Risk Model | Analysis | 65.0 |
| AI Risk Activation Timeline Model | Analysis | 66.0 |
Cached Content Preview
HTTP 200 · Fetched Mar 20, 2026 · 1 KB
Open LLM Leaderboard - Compare Open Source Large Language Models. Open LLM Leaderboard (Archived). Comparing Large Language Models in an open and reproducible way. Supports strict search and regex • Use semicolons for multiple terms. Quick Filters.
Resource ID: 519d45a8450736f6 | Stable ID: ODEyMmM4MT