Longterm Wiki

Metadata

1 FactBase fact citing this source

Cached Content Preview

HTTP 200, fetched Apr 30, 2026, 10 KB
Contact [evanjhub@gmail.com](mailto:evanjhub@gmail.com)
Information (925) 240-3826

alignmentforum.org/users/evhub
github.com/evhub

Education Harvey Mudd College, Claremont, CA
B.S. in Mathematics and Computer Science
High Distinction, Honors in Mathematics, Dean’s List

Graduated: May 2019
GPA: 3.912

The College Preparatory School, Oakland, CA

GPA: 3.912

Summary AI safety research fellow at the Machine Intelligence Research Institute. Previously did theoretical AI safety
research with Paul Christiano at OpenAI. Machine learning research experience working for HRL Laboratories and the Music Information Retrieval Lab at Harvey Mudd College. Professional software engineering
experience at Google, Yelp, and Ripple. Author of the Coconut programming language.

Papers An overview of 11 proposals for building safe advanced AI
Evan Hubinger
May 2020
arxiv.org/abs/2012.07532
A comparative analysis of 11 different leading proposals for building safe advanced AI under the current machine learning paradigm. Analyzes each proposal on the four components of outer alignment, inner alignment, training competitiveness, and performance competitiveness.

Risks from Learned Optimization in Advanced Machine Learning Systems June 2019
Evan Hubinger, Chris van Merwijk, Vladimir Mikulik, Joar Skalse, Scott Garrabrant
arxiv.org/abs/1906.01820
Introduces the concept and the potential dangers of inner alignment and mesa-optimization, specifically discussing when a trained model might be an optimizer and how it might be misaligned.

Research Experience
Research Fellow
Machine Intelligence Research Institute, Berkeley, CA
November 2019 – Present
Wrote “An overview of 11 proposals for building safe advanced AI,” as detailed above.
Produced two new alternative AI safety via debate proposals, “AI Safety via Market Making” and “Synthesizing Amplification and Debate.”
Analyzed different alignment proposals from a computational complexity standpoint as in “AI safety via debate,” resulting in “Alignment Proposals and Complexity Classes” and “Weak HCH Accesses EXP.”
Produced a wide variety of additional work on the Alignment Forum, including “Gradient hacking,” “Chris Olah’s views on

... (truncated, 10 KB total)
Resource ID: e8b330ed5e1beb99 | Stable ID: sid_3UtoAWTCef