Skip to content
Longterm Wiki
Back

Credibility Rating

3/5
Good(3)

Good quality. Reputable source with community review or editorial standards, but less rigorous than peer-reviewed venues.

Rating inherited from publication venue: GitHub

A foundational DeepMind benchmark suite (2017) for evaluating RL agent safety properties; archived in 2023 but remains a standard reference for alignment researchers studying concrete safety failure modes in toy environments.

Metadata

Importance: 72/100tool pagetool

Summary

AI Safety Gridworlds is a suite of reinforcement learning environments from DeepMind designed to test and evaluate AI safety properties such as safe interruptibility, avoiding side effects, reward hacking, and distributional shift. Each gridworld scenario isolates a specific safety challenge, providing a standardized benchmark for safety research. The repository is now archived but remains a widely-cited foundational resource in the AI safety literature.

Key Points

  • Provides a collection of toy RL environments, each targeting a distinct AI safety problem (e.g., safe interruptibility, side-effect avoidance, reward gaming).
  • Includes a 'performance' vs. 'safety' reward distinction, allowing evaluation of agents on both task completion and safety criteria separately.
  • Accompanied by the paper 'AI Safety Gridworlds' (Leike et al., 2017), which formalizes several key safety desiderata for RL agents.
  • Archived in 2023 but still widely used as a benchmark and reference point in AI safety evaluation research.
  • Supports reproducible, minimal environments that make it easier to isolate and study individual alignment failure modes.

Cited by 1 page

PageTypeQuality
AI Knowledge MonopolyRisk50.0

Cached Content Preview

HTTP 200Fetched Mar 20, 202613 KB
[Skip to content](https://github.com/google-deepmind/ai-safety-gridworlds#start-of-content)

You signed in with another tab or window. [Reload](https://github.com/google-deepmind/ai-safety-gridworlds) to refresh your session.You signed out in another tab or window. [Reload](https://github.com/google-deepmind/ai-safety-gridworlds) to refresh your session.You switched accounts on another tab or window. [Reload](https://github.com/google-deepmind/ai-safety-gridworlds) to refresh your session.Dismiss alert

{{ message }}

This repository was archived by the owner on Jul 21, 2023. It is now read-only.


[google-deepmind](https://github.com/google-deepmind)/ **[ai-safety-gridworlds](https://github.com/google-deepmind/ai-safety-gridworlds)** Public archive

- [Notifications](https://github.com/login?return_to=%2Fgoogle-deepmind%2Fai-safety-gridworlds) You must be signed in to change notification settings
- [Fork\\
125](https://github.com/login?return_to=%2Fgoogle-deepmind%2Fai-safety-gridworlds)
- [Star\\
630](https://github.com/login?return_to=%2Fgoogle-deepmind%2Fai-safety-gridworlds)


master

[**1** Branch](https://github.com/google-deepmind/ai-safety-gridworlds/branches) [**0** Tags](https://github.com/google-deepmind/ai-safety-gridworlds/tags)

[Go to Branches page](https://github.com/google-deepmind/ai-safety-gridworlds/branches)[Go to Tags page](https://github.com/google-deepmind/ai-safety-gridworlds/tags)

Go to file

Code

Open more actions menu

## Folders and files

| Name | Name | Last commit message | Last commit date |
| --- | --- | --- | --- |
| ## Latest commit<br>[![miljanm](https://avatars.githubusercontent.com/u/1547789?v=4&size=40)](https://github.com/miljanm)[miljanm](https://github.com/google-deepmind/ai-safety-gridworlds/commits?author=miljanm)<br>[Version 1.5](https://github.com/google-deepmind/ai-safety-gridworlds/commit/c43cb31143431421b5d2b661a2458efb301da9a3)<br>6 years agoOct 13, 2020<br>[c43cb31](https://github.com/google-deepmind/ai-safety-gridworlds/commit/c43cb31143431421b5d2b661a2458efb301da9a3) · 6 years agoOct 13, 2020<br>## History<br>[20 Commits](https://github.com/google-deepmind/ai-safety-gridworlds/commits/master/) <br>Open commit details<br>[View commit history for this file.](https://github.com/google-deepmind/ai-safety-gridworlds/commits/master/) 20 Commits |
| [ai\_safety\_gridworlds](https://github.com/google-deepmind/ai-safety-gridworlds/tree/master/ai_safety_gridworlds "ai_safety_gridworlds") | [ai\_safety\_gridworlds](https://github.com/google-deepmind/ai-safety-gridworlds/tree/master/ai_safety_gridworlds "ai_safety_gridworlds") | [Version 1.5](https://github.com/google-deepmind/ai-safety-gridworlds/commit/c43cb31143431421b5d2b661a2458efb301da9a3 "Version 1.5") | 6 years agoOct 13, 2020 |
| [.gitignore](https://github.com/google-deepmind/ai-safety-gridworlds/blob/master/.gitignore ".gitignore") | [.gitignore](https://github.com/google-deepmind/ai-safety-gridworlds/blob/master/.gitignore ".gitignore") | [

... (truncated, 13 KB total)
Resource ID: 64f41b0780d481a9 | Stable ID: ZDFkNWM2MD