Longterm Wiki

Constitutional AI

Alignment Training (active)

Training approach where AI systems critique and revise their own outputs using a set of principles, reducing reliance on human feedback.
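The critique-and-revise loop described above can be sketched in a few lines. This is a minimal illustration, not Anthropic's implementation: `generate`, `critique`, and `revise` are hypothetical placeholders for three prompts to the same language model, and the principles shown are invented examples.

```python
# Hypothetical sketch of the Constitutional AI supervised revision loop.
# In the real method (Bai et al., 2022), each function below would be a
# call to the language model with a critique or revision prompt.

PRINCIPLES = [
    "Avoid giving instructions that could cause harm.",
    "Be honest about uncertainty.",
]

def generate(prompt: str) -> str:
    # Placeholder for the model's initial draft response.
    return f"Draft answer to: {prompt}"

def critique(response: str, principle: str) -> str:
    # Placeholder: the model identifies how the response
    # falls short of the given principle.
    return f"Critique of response against: {principle}"

def revise(response: str, critique_text: str) -> str:
    # Placeholder: the model rewrites the response to
    # address the critique.
    return response + " [revised per critique]"

def constitutional_revision(prompt: str) -> str:
    """Draft, then critique and revise once per principle."""
    response = generate(prompt)
    for principle in PRINCIPLES:
        c = critique(response, principle)
        response = revise(response, c)
    return response
```

The revised outputs produced this way become the training data for a supervised fine-tuning stage, which is what reduces the reliance on human feedback labels.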

- Organizations: 1
- Key Papers: 1
- Grants: 1
- Total Funding: $42K
First Proposed: 2022 (Bai et al., Anthropic)
Cluster: Alignment Training

Tags

training, self-supervision, alignment, anthropic

Organizations (1)

| Organization | Role |
| --- | --- |
| Anthropic | pioneer |

Grants (1)

| Name | Recipient | Amount | Funder | Date |
| --- | --- | --- | --- | --- |
| 6-month 1 FTE funding to train Multi-Objective RLAIF models and compare their safety performance to standard RLAIF | Marcus Williams | $42K | Long-Term Future Fund (LTFF) | 2023-10 |

Funding by Funder

| Funder | Grants | Total Amount |
| --- | --- | --- |
| Long-Term Future Fund (LTFF) | 1 | $42K |

Key Papers & Resources (1)

SEMINAL