
Berkeley CHAI Research

web
humancompatible.ai/research

CHAI at UC Berkeley is a leading academic AI safety research center; this page indexes their active research projects and is useful for tracking frontier academic work on alignment and human-compatible AI development.

Metadata

Importance: 72/100 · homepage

Summary

The Berkeley Center for Human-Compatible AI (CHAI) conducts foundational research on making AI systems that are safe and beneficial for humans. Their work focuses on value alignment, preference learning, and ensuring AI systems remain under meaningful human control. CHAI is one of the leading academic institutions dedicated to long-term AI safety research.

Key Points

  • CHAI focuses on technical AI alignment research, particularly on developing AI systems that learn and align with human values and preferences
  • Research areas include inverse reward design, cooperative AI, and formal methods for ensuring AI systems remain corrigible and controllable
  • Founded by Stuart Russell, CHAI emphasizes the 'assistance game' framework where AI systems are uncertain about human objectives
  • Produces both theoretical foundations and practical methods for building provably beneficial AI systems
  • Collaborates with policymakers and other institutions to bridge technical safety research with governance applications

Cited by 1 page

| Page | Type | Quality |
| --- | --- | --- |
| AI Risk Interaction Network Model | Analysis | 64.0 |

Cached Content Preview

HTTP 200 · Fetched Mar 20, 2026 · 98 KB
CHAI aims to reorient the foundations of AI research toward the development of _provably beneficial systems_. Currently, it is not possible to specify a formula for human values in any form that we know would provably benefit humanity, if that formula were instated as the objective of a powerful AI system. In short, any initial formal specification of human values is bound to be wrong in important ways. This means we need to somehow represent _uncertainty_ in the objectives of AI systems. This way of formulating objectives stands in contrast to the standard model for AI, in which the AI system's objective is assumed to be known completely and correctly.

Therefore, much of CHAI's research effort to date has focused on developing and communicating a new model of AI development, in which AI systems should be uncertain of their objectives and deferential to humans in light of that uncertainty. However, our interests extend to a variety of other problems in the development of provably beneficial AI systems. Our areas of greatest focus so far have been:

- the foundations of rational agency and causality,
- value alignment and inverse reinforcement learning,
- human-robot cooperation,
- multi-agent perspectives and applications, and
- models of bounded or imperfect rationality.

Other areas of interest to our mission include:

- adversarial training and testing for ML systems,
- various AI capabilities,
- topics in cognitive science,
- ethics for AI and AI development,
- robust inference and planning,
- security problems and solutions, and
- transparency and interpretability methods.
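The core contrast described above — an agent that maintains uncertainty over its objective and defers to humans rather than optimizing a fixed reward — can be illustrated with a toy sketch. This is a hypothetical example for intuition only, not CHAI's code or methods: the candidate reward functions, the Boltzmann-rationality assumption about human choices, and the entropy-based deferral rule are all illustrative assumptions.

```python
import math

# Toy sketch of objective uncertainty (illustrative, not CHAI's method):
# the agent does not know which reward function the human holds, maintains
# a posterior over candidate objectives, and defers (asks) when uncertain.

# Hypothetical candidate reward functions over two actions.
candidate_rewards = {
    "cautious":  {"safe": 1.0, "risky": -2.0},
    "ambitious": {"safe": 0.2, "risky": 1.5},
}
posterior = {"cautious": 0.5, "ambitious": 0.5}  # uniform prior

def update(observed_choice, beta=2.0):
    """Bayesian update from one human choice, assuming a Boltzmann-rational
    human: P(choice | reward) ∝ exp(beta * reward(choice))."""
    global posterior
    unnormalized = {}
    for name, reward in candidate_rewards.items():
        z = sum(math.exp(beta * v) for v in reward.values())
        likelihood = math.exp(beta * reward[observed_choice]) / z
        unnormalized[name] = posterior[name] * likelihood
    total = sum(unnormalized.values())
    posterior = {k: v / total for k, v in unnormalized.items()}

def act(entropy_threshold=0.5):
    """Defer to the human while the posterior over objectives is too
    uncertain; otherwise act on the most probable objective."""
    entropy = -sum(p * math.log2(p) for p in posterior.values() if p > 0)
    if entropy > entropy_threshold:
        return "ask_human"
    best = max(posterior, key=posterior.get)
    return max(candidate_rewards[best], key=candidate_rewards[best].get)

print(act())        # uniform prior has maximal entropy, so the agent defers
for _ in range(5):
    update("safe")  # the human repeatedly picks the safe action
print(act())        # posterior concentrates on "cautious"; the agent acts
```

Under the fixed-objective "standard model" the agent would optimize a single reward from the start; here, deferral falls out of the uncertainty itself, which is the behavior the assistance-game framing is meant to produce.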


In addition to purely academic work, CHAI strives to produce intellectual outputs for general audiences as well. We also advise governments and international organizations on policies relevant to ensuring AI technologies will benefit society, and offer insight on a variety of individual-scale and societal-scale risks from AI, such as pertaining to autonomous weapons, the future of employment, and public health and safety.

Below is a list of CHAI's publications since we began operating in 2016. Many of our publications are collaborations with other AI research groups; we view collaborations as key to integrating our perspectives into mainstream AI research.


- Jonathan Stray. 2022.
[Risk Ratios.](https://github.com/jstray/risk-ratios) _NICAR 2022_
- J Stray. 2022.
[Better Conflict Bulletin.](https://betterconflictbulletin.substack.com/)
- OA Dada, G Obaido, IT Sanusi, K Aruleba, AA Yunusa. 2022.
[Hidden Gold for IT Professionals, Educators, and Students: Insights From Stack Overflow Survey.](https://scholar.google.co.za/citations?view_op=view_citation&hl=en&user=cCXONVYAAAAJ&sortby=pubdate&citation_for_view=cCXONVYAAAAJ:9ZlFYXVOiuMC) _IEEE Transactions on Computational Social Systems_
- OA Dada, K Aruleba, AA Yunusa, IT Sanusi, G Obaido. 2022.
[Information Technology Roles and Their Most-Used Programming Languages.](https://scholar.google.co.za/citations?vie

... (truncated, 98 KB total)
Resource ID: b8668d08f397f100 | Stable ID: N2ViMTg1Zj