Anthropic Alignment
Company Blog
Anthropic's alignment research portal
Credibility Rating: 4/5 (High)
High quality. Established institution or organization with editorial oversight and accountability.
Resources: 12
Citing pages: 29
Tracked domains: 1
Tracked Domains
alignment.anthropic.com
Resources (12)
Citing Pages (29)
AI Accident Risk Cruxes
Agentic AI
AI-Assisted Alignment
Alignment Evaluations
Alignment Robustness Trajectory Model
Anthropic Core Views
Corrigibility Failure
Epistemic Virtue Evals
AI Evaluations
Evals-Based Deployment Gates
Evan Hubinger
Goal Misgeneralization
Instrumental Convergence
Model Organisms of Misalignment
Open Source AI Safety
Power-Seeking AI
Process Supervision
AI Alignment Research Agendas
Reward Hacking
AI Safety Cases
AI Capability Sandbagging
Sandboxing / Containment
Scalable Eval Approaches
Scalable Oversight
Sharp Left Turn
Sycophancy
AI Safety Technical Pathway Decomposition
AI Safety Training Programs
Weak-to-Strong Generalization
Publication ID: anthropic-alignment