Eliciting Latent Knowledge

Scalable Oversight (status: active)

Extracting what an AI model 'actually believes' rather than what it says, addressing the distinction between model knowledge and model outputs.

Total Funding: $181K

First Proposed: 2021 (Christiano et al., ARC)

Cluster: Scalable Oversight

Organizations (1)

Organization	Role
Alignment Research Center (ARC)	pioneer

Name	Recipient	Amount	Funder	Date
Grant to "support a competition for work on Eliciting Latent Knowledge, an open problem in AI alignment, for talented high school and college students who are participating in Prometheus Science Bowl."	Prometheus Science Bowl	$100K	FTX Future Fund	2022-05
A research & networking retreat for winners of the Eliciting Latent Knowledge contest	-	$72K	Long-Term Future Fund (LTFF)	2022-10
3 months relocation from Chad to London to work on Eliciting Latent Knowledge with Jake Mendel from Apollo Research	Sienka Dounia	$8.5K	Long-Term Future Fund (LTFF)	2024-01

Funder	Grants	Total Amount
FTX Future Fund	1	$100K
Long-Term Future Fund (LTFF)	2	$81K

SEMINAL

Christiano et al. (ARC), 2021