Longterm Wiki

Scheming Detection

Evaluation · emerging

Research on detecting when AI systems are engaged in deceptive alignment or strategic manipulation of their training process.

Organizations: 4
Grants: 1
Total Funding: $27K
Cluster: Evaluation
Parent Area: AI Evaluations

Tags

evaluations, deception, deceptive-alignment

Grants (1)

| Name | Recipient | Amount | Funder | Date |
| --- | --- | --- | --- | --- |
| 4-month grant to conduct deceptive alignment evaluation research and explore control and mitigation strategies | Kai Fronsdal | $27K | Long-Term Future Fund (LTFF) | 2024-07 |

Funding by Funder

| Funder | Grants | Total Amount |
| --- | --- | --- |
| Long-Term Future Fund (LTFF) | 1 | $27K |