Longterm Wiki

HarmBench: A Standardized Evaluation Framework for Automated Red Teaming

publication | Failed source check

Metadata

Source Table: publications
Source ID: lCVquUGcsU
Description: Mantas Mazeika, Long Phan, Xuwang Yin et al., 2024
Source URL: harmbench.org/
Parent: Center for AI Safety (CAIS)
Children:
Created: Mar 23, 2026, 2:46 PM
Updated: Mar 23, 2026, 2:46 PM
Synced: Mar 23, 2026, 2:46 PM

Record Data

id: lCVquUGcsU
entityId: Center for AI Safety (CAIS) (organization)
entityDisplayName:
resourceId:
title: HarmBench: A Standardized Evaluation Framework for Automated Red Teaming
authors: Mantas Mazeika, Long Phan, Xuwang Yin et al.
url: harmbench.org/
venue:
publishedDate: 2024
publicationType: paper
citationCount:
isFlagship: Yes
abstract:
source: harmbench.org/
notes: ICML 2024

Source Check Verdicts

unverifiable (95% confidence)

Last checked: 4/29/2026

1 → unverifiable

Debug info

Thing ID: lCVquUGcsU

Source Table: publications

Source ID: lCVquUGcsU

Parent Thing ID: sid_y4bieqSeag