Skip to content
Longterm Wiki

Red Teaming

Evaluationactive

Adversarial testing of AI systems to discover failure modes, safety issues, and vulnerabilities, both manual and automated.

Organizations
7
Grants
4
Total Funding
$198K
Risks Addressed
1
Cluster: Evaluation
Parent Area: AI Evaluations

Tags

red-teamingadversarialsafety-testing

Organizations3

OrganizationRole
Anthropicactive
Google DeepMindactive
OpenAIactive

Grants4

NameRecipientAmountFunderDate
UC Berkeley — AI Red-teaming BootcampUniversity of California, Berkeley$100KCoefficient Giving2025-01
Meta level adversarial evaluation of debate (scalable oversight technique) on simple math problems (MATS 5.0 project)Yoav Tzfati$62KLong-Term Future Fund (LTFF)2024-01
SoGive does EA analysis. We red-team EA analysis and conduct EA research to support donors. We also support EA talentSoGive$18KCentre for Effective Altruism2022-07
SoGive does EA analysis. We red-team EA analysis and conduct EA research to support donors. We also support EA talentSoGive$18KCentre for Effective Altruism2022-07

Funding by Funder