Redwood Research

Safety Organization

Founded Jun 2021 (4 years old) · HQ: San Francisco, CA · redwoodresearch.org

Also known as: Redwood

Research & Technical Papers (2)


Redwood Research, 2024 Redwood Research's AI Control research program focuses on developing techniques to ensure AI systems behave safely even if they are misaligned or adversarially inclined, by building robust oversight and control mechanisms rather than relying solely on alignment. The approach emphasizes empirically evaluating whether safety measures hold up against a red-teamed 'untrusted' AI attempting to subvert them. This represents a complementary strategy to alignment research, treating safety as an engineering and evaluation problem.	paper	redwoodresearch.org	2
Research Causal Scrubbing is a methodology developed by Redwood Research for evaluating mechanistic interpretability hypotheses about neural networks. It provides a principled, algorithmic approach to test whether a proposed explanation of a model's computation is correct by systematically replacing activations and measuring the impact on model behavior. This framework helps researchers rigorously validate or falsify circuit-level interpretability claims.	paper	redwoodresearch.org	3
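
The activation-replacement idea in the Causal Scrubbing entry can be illustrated with a toy sketch. This is not Redwood Research's implementation: the two-activation "network", the function names (activations, head, scrubbed_agreement), and the use of agreement with the unscrubbed output as the behavior metric are all assumptions made for illustration. Each hypothesis is expressed as a rule for which other inputs count as equivalent donors for a given activation; if the hypothesis is right, swapping in a donor's activation should leave behavior unchanged.

```python
import random

def activations(x):
    """Hypothetical toy layer: two intermediate activations."""
    return [x[0] + x[1], x[2]]

def head(acts):
    """Hypothetical toy output head: reads only the first activation."""
    return 1 if acts[0] > 0 else 0

def scrubbed_agreement(data, index, equivalent, rng):
    """Replace activation `index` with the one computed on a donor input that
    the hypothesis treats as equivalent; return the fraction of inputs whose
    output matches the unscrubbed model (a simplified behavior metric)."""
    preserved = 0
    for x in data:
        donors = [d for d in data if equivalent(x, d)]  # always contains x itself
        donor = rng.choice(donors)
        acts = activations(x)
        acts[index] = activations(donor)[index]
        preserved += head(acts) == head(activations(x))
    return preserved / len(data)

rng = random.Random(0)
data = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(200)]

# Hypothesis A: the second activation is irrelevant, so any input is a valid donor.
score_a = scrubbed_agreement(data, 1, lambda x, d: True, rng)

# Hypothesis B: the first activation only encodes the sign of x[0],
# so any input with the same sign of x[0] is a valid donor.
score_b = scrubbed_agreement(
    data, 0, lambda x, d: (x[0] > 0) == (d[0] > 0), rng
)

print(f"hypothesis A agreement: {score_a:.2f}")
print(f"hypothesis B agreement: {score_b:.2f}")
```

In this toy setup hypothesis A survives scrubbing with agreement of exactly 1.0, because the head never reads the second activation. Hypothesis B loses agreement because the first activation also depends on x[1], which is the kind of drop that would count as evidence against a proposed explanation.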