Longterm Wiki

MIRI/Open Philanthropy exchange on decision theory

blog

Author

Rob Bensinger

Credibility Rating

3/5 (Good)

Good quality. Reputable source with community review or editorial standards, but less rigorous than peer-reviewed venues.

Rating inherited from publication venue: Alignment Forum

This exchange is a rare public record of institutional disagreement between MIRI and Open Philanthropy on decision theory, making it valuable for understanding the landscape of foundational agent-design debates in AI alignment research.

Metadata

Importance: 62/100 · blog post · primary source

Summary

This post documents a substantive dialogue between MIRI and Open Philanthropy researchers comparing decision theories (CDT, EDT, TDT, UDT, FDT) and their relevance to AI alignment. The exchange focuses on whether updateless decision theories outperform updateful variants on key philosophical dilemmas such as counterfactual mugging and Troll Bridge. It serves as a useful reference for understanding where these organizations agree and disagree on foundational decision-theoretic questions.

Key Points

  • Clarifies distinctions between CDT, EDT, TDT, UDT, and FDT, providing a structured comparison of major decision theory frameworks.
  • Debates whether updateless approaches (UDT, updateless FDT) systematically outperform updateful versions on canonical dilemmas like counterfactual mugging.
  • Explores the Troll Bridge problem as a stress test for decision theories, highlighting edge cases where standard frameworks struggle.
  • Reflects genuine disagreement between MIRI and Open Philanthropy researchers, making institutional perspectives on foundational AI alignment questions explicit.
  • Relevant to AI alignment because the choice of decision theory for AI agents may have significant implications for their behavior in strategic or adversarial settings.

Cited by 1 page

| Page | Type | Quality |
| --- | --- | --- |
| Agent Foundations | Approach | 59.0 |

Cached Content Preview

HTTP 200 · Fetched Mar 15, 2026 · 26 KB

MIRI/OP exchange about decision theory — AI Alignment Forum

[Decision theory](https://www.alignmentforum.org/w/decision-theory)[Functional Decision Theory](https://www.alignmentforum.org/w/functional-decision-theory)[Causal Decision Theory](https://www.alignmentforum.org/w/causal-decision-theory)[Embedded Agency](https://www.alignmentforum.org/w/embedded-agency)[Evidential Decision Theory](https://www.alignmentforum.org/w/evidential-decision-theory)[AI](https://www.alignmentforum.org/w/ai)[Rationality](https://www.alignmentforum.org/w/rationality)


# [MIRI/OP exchange about decision theory](https://www.alignmentforum.org/posts/FBbHEjkZzdupcjkna/miri-op-exchange-about-decision-theory-1)

by [Rob Bensinger](https://www.alignmentforum.org/users/robbbb?from=post_header)

25th Aug 2021

12 min read

[7](https://www.alignmentforum.org/posts/FBbHEjkZzdupcjkna/miri-op-exchange-about-decision-theory-1#comments)


Open Philanthropy's Joe Carlsmith and Nick Beckstead had a short conversation about [decision theory](https://www.alignmentforum.org/s/Rm6oQRJJmhGCcLvxh/p/zcPLNNw4wgBX5k8kQ) a few weeks ago with MIRI's Abram Demski and Scott Garrabrant (and me) and LW's Ben Pace. I'm copying it here because I thought others might find it useful.

Terminology notes:

- CDT is [**causal decision theory**](https://plato.stanford.edu/entries/decision-causal/), the dominant theory among working decision theorists. CDT says to choose the action with the best causal consequences.
- EDT is **evidential decision theory**, CDT's traditional rival. EDT says to choose the action such that things go best _conditional_ on your choosing that action.
- TDT is [**timeless decision theory**](http://intelligence.org/files/TDT.pdf), a theory proposed by Eliezer Yudkowsky in 2010. TDT was superseded by FDT/UDT because TDT fails on dilemmas like [counterfactual mugging](https://www.alignmentforum.org/w/counterfactual-mugging), refusing to pay the mugger.
- UDT is [**updateless decision theory**](https://www.alignmentforum.org/w/updateless-decision-theory), a theory proposed by Wei Dai in 2009. UDT in effect asks what action "you would have pre-committed to without the benefit of any observations you have made about the universe", and chooses that action.
- FDT is [**functional decision theory**](https://arxiv.org/abs/1710.05060), an umbrella term introduced by Yudkowsky and Nate Soares in 2017 to refer to UDT-ish approaches to decision theory.
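The updateless/updateful split above can be made concrete with the payoffs of counterfactual mugging. A minimal sketch (the fair coin and the $100/$10,000 figures are the conventional ones from the literature, not taken from this exchange): an updateless theory evaluates the policy before the coin flip, while an updateful theory evaluates the action after observing tails.

```python
# Counterfactual mugging: Omega flips a fair coin. On heads, Omega pays
# $10,000 iff it predicts you would pay $100 on tails; on tails, Omega
# asks you for $100.

def ex_ante_value(policy_pays: bool) -> float:
    """Expected value of a policy evaluated before the coin flip,
    the way an updateless theory (UDT / updateless FDT) evaluates it."""
    p_heads = 0.5
    reward = 10_000 if policy_pays else 0  # Omega rewards predicted payers
    cost = -100 if policy_pays else 0      # paying on tails costs $100
    return p_heads * reward + (1 - p_heads) * cost

def updateful_value(pays: bool) -> float:
    """Value after observing tails: the heads branch is gone, so an
    updateful theory sees only the cost of paying."""
    return -100 if pays else 0

# Updateless verdict: committing to pay wins ex ante.
assert ex_ante_value(True) == 4950.0   # 0.5 * 10_000 + 0.5 * (-100)
assert ex_ante_value(False) == 0.0

# Updateful verdict: after seeing tails, refusing looks strictly better.
assert updateful_value(True) < updateful_value(False)
```

This is the structural disagreement the dialogue returns to: both calculations are internally consistent, and the dispute is over which evaluation point an agent should use.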

* * *

**Carlsmith:** Anyone have an example of a case where FDT and updateless EDT give different verdicts?

**Beckstead:** Is smoking lesion an example?

I haven't thought about how updateless EDT handles that differently from EDT.

**Demski:** FDT is supposed to be an overarching framework for decision theories "in the MIRI style", whereas updateless EDT is a specific decision theory.

In particular, FDT may or may not be updateless.

Updateful FDT is basically TDT.

Now, I generally claim it's harder to find examples wher

... (truncated, 26 KB total)