
Agent Foundations for Aligning Machine Intelligence

Type: web

Author

Kolya T

Credibility Rating

Good (3/5)

Good quality. Reputable source with community review or editorial standards, but less rigorous than peer-reviewed venues.

Rating inherited from publication venue: MIRI

This is MIRI's official research guide, useful for understanding the agent-foundations approach to alignment and identifying open technical problems; best paired with MIRI's technical papers and the Embedded Agency sequence.

Metadata

Importance: 72/100 · homepage · reference

Summary

MIRI's research guide outlines the theoretical foundations and open problems of the agent-foundations approach to AI alignment, focusing on decision theory, logical uncertainty, corrigibility, and related mathematical challenges. It provides a roadmap for researchers interested in contributing to foundational alignment work. The guide situates these problems within the broader goal of ensuring advanced AI systems remain safe and beneficial.

Key Points

  • Covers core MIRI research agendas including logical uncertainty, decision theory, and embedded agency problems.
  • Addresses corrigibility and the shutdown problem as central challenges for building safe, correctable AI agents.
  • Explores mesa-optimization and inner alignment risks arising from learned models pursuing unintended sub-goals.
  • Serves as an entry point for technically minded researchers wanting to contribute to foundational AI safety work.
  • Connects formal agent foundations to broader alignment goals such as value learning and corrigible behavior.

Cited by 5 pages

Cached Content Preview

HTTP 200 · Fetched Mar 20, 2026 · 50 KB

# A Guide to MIRI’s Research

_by Nate Soares_

**Update June 2022**: As noted in the 2019 update below, this research guide has only been lightly updated since 2015. We’re also currently doing less hiring (though not zero hiring), and are not currently running AIRCS workshops (though we may run more in the future).

If you’re interested in contributing to the alignment problem, we recommend starting with the [Alignment Research Field Guide](https://www.lesswrong.com/posts/PqMT9zGrNsGJNfiFR/alignment-research-field-guide), [How To Get Into Independent Research On Alignment/Agency](https://www.lesswrong.com/posts/P3Yt66Wh5g7SbkKuT/how-to-get-into-independent-research-on-alignment-agency), and the resources on the [Late 2021 MIRI Conversations](https://intelligence.org/late-2021-miri-conversations/) page.

If you have additional questions about how to get involved, we recommend contacting Buck Shlegeris of [Redwood Research](https://www.redwoodresearch.org/) or posting on [LessWrong](https://lesswrong.com/).

**Update March 2019**: This research guide has been only lightly updated since 2015. Our new recommendation for people who want to work on the [AI alignment problem](https://intelligence.org/research-guide/#ten) is:

- If you have a computer science or software engineering background: Apply to attend our new [workshops on AI risk](https://intelligence.org/ai-risk-for-computer-scientists/) and to [work as an engineer at MIRI](https://intelligence.org/engineers). For this purpose, you don’t need any prior familiarity with our research.
  - If you aren’t sure whether you’d be a good fit for an AI risk workshop, or for an engineer position, [shoot us an email](mailto:buck@intelligence.org) and we can talk about whether it makes sense.
  - You can find out more about our engineering program in our [2018 strategy update](https://intelligence.org/2018/11/22/2018-update-our-new-research-directions/).
- If you’d like to learn more about the problems we’re working on (regardless of your answer to the above): See “[Embedded Agency](https://www.lesswrong.com/posts/i3BTagvt3HbPMx6PN/embedded-agency-full-text-version)” for an introduction to our agent foundations research, and see our [Alignment Research Field Guide](https://www.lesswrong.com/posts/PqMT9zGrNsGJNfiFR/alignment-research-field-guide) for general recommendations on how to get started in AI safety.
  - After checking out those two resources, you can use the links and references in “Embedded Agency” and on this page to learn more about the topics you want to drill down on. If you want a particular problem set to focus on, we suggest Scott Garrabrant’s “[Fixed Point Exercises](https://www.lesswrong.com/posts/mojJ6Hpri8rfzY78b/fixed-point-exercises).” As Scott notes:

    > Sometimes people ask me what math they should study in order to get into agent foundations. My first answer is that I have found the introductory class in ev

... (truncated, 50 KB total)
Resource ID: ee872736d7fbfcd5 | Stable ID: NzQxOWIyMT