AI Benchmarking
Standardized evaluations for measuring AI capabilities and safety properties
This page is a stub. Content needed.
AI safety research nonprofit founded in 2022 by Adam Gleave and Karl Berzins, focusing on making AI systems safe through technical research and coo...
US government agency for AI safety research and standard-setting under NIST, established November 2023 with $10M initial budget (FY2025 request of ...
This model analyzes when safety measures conflict with capabilities. It finds most safety interventions impose 5-15% capability cost, with some ach...
Quantitative measures tracking AI model performance across language, coding, and multimodal benchmarks from 2020-2025, showing rapid progress with ...
AI systems' understanding of their own nature and circumstances, representing a critical threshold capability that enables strategic deception and ...