HarmBench: A Standardized Evaluation Framework for Automated Red Teaming
publication
Failed source check
Child of Center for AI Safety (CAIS)
Metadata
| Source Table | publications |
| Source ID | lCVquUGcsU |
| Description | Mantas Mazeika, Long Phan, Xuwang Yin et al., 2024 |
| Source URL | harmbench.org/ |
| Parent | Center for AI Safety (CAIS) |
| Children | — |
| Created | Mar 23, 2026, 2:46 PM |
| Updated | Mar 23, 2026, 2:46 PM |
| Synced | Mar 23, 2026, 2:46 PM |
Record Data
| id | lCVquUGcsU |
| entityId | Center for AI Safety (CAIS) (organization) |
| entityDisplayName | — |
| resourceId | — |
| title | HarmBench: A Standardized Evaluation Framework for Automated Red Teaming |
| authors | Mantas Mazeika, Long Phan, Xuwang Yin et al. |
| url | harmbench.org/ |
| venue | — |
| publishedDate | 2024 |
| publicationType | paper |
| citationCount | — |
| isFlagship | Yes |
| abstract | — |
| source | harmbench.org/ |
| notes | ICML 2024 |
Source Check Verdicts
unverifiable (95% confidence)
Last checked: 4/29/2026
1 → unverifiable
Debug info
Thing ID: lCVquUGcsU
Source Table: publications
Source ID: lCVquUGcsU
Parent Thing ID: sid_y4bieqSeag