AI Safety Gridworlds
web
Credibility Rating
Good quality. Reputable source with community review or editorial standards, but less rigorous than peer-reviewed venues.
Rating inherited from publication venue: GitHub
A foundational DeepMind benchmark suite (2017) for evaluating RL agent safety properties; archived in 2023 but remains a standard reference for alignment researchers studying concrete safety failure modes in toy environments.
Metadata
Summary
AI Safety Gridworlds is a suite of reinforcement learning environments from DeepMind designed to test and evaluate AI safety properties such as safe interruptibility, avoiding side effects, reward gaming, and robustness to distributional shift. Each gridworld scenario isolates a specific safety challenge, providing a standardized benchmark for safety research. The repository is now archived but remains a widely cited foundational resource in the AI safety literature.
Key Points
- Provides a collection of toy RL environments, each targeting a distinct AI safety problem (e.g., safe interruptibility, side-effect avoidance, reward gaming).
- Distinguishes the reward function the agent observes from a hidden performance function that also scores safe behavior, so agents can be evaluated on task completion and on safety separately.
- Accompanied by the paper 'AI Safety Gridworlds' (Leike et al., 2017), which formalizes several key safety desiderata for RL agents.
- Archived on Jul 21, 2023, but still widely used as a benchmark and reference point in AI safety evaluation research.
- Supports reproducible, minimal environments that make it easier to isolate and study individual alignment failure modes.
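The split between what the agent is rewarded for and what the evaluator scores can be illustrated with a minimal sketch. The code below does not use the repository's API; the layout, the `ToyGridworld` class, the `run_episode` helper, and all reward values are hypothetical, chosen only to show how a side effect (breaking a vase) can be invisible in the observed reward yet penalized in a hidden performance score.

```python
from dataclasses import dataclass, field

# Hypothetical toy layout (not one of the DeepMind environments):
# A = agent start, V = fragile vase, G = goal, # = wall, space = free cell.
LAYOUT = [
    "#####",
    "#AVG#",
    "#   #",
    "#####",
]

MOVES = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}

# Illustrative values, assumed for this sketch only.
STEP_REWARD = -1
GOAL_REWARD = 50
VASE_PENALTY = -10  # applied to the hidden performance score only


@dataclass
class ToyGridworld:
    grid: list = field(default_factory=lambda: [list(row) for row in LAYOUT])
    reward: int = 0       # what the agent observes
    performance: int = 0  # hidden from the agent, read by the evaluator
    done: bool = False

    def __post_init__(self):
        # Locate the agent's start cell and clear it from the grid.
        for r, row in enumerate(self.grid):
            for c, cell in enumerate(row):
                if cell == "A":
                    self.pos = (r, c)
                    self.grid[r][c] = " "

    def step(self, action):
        if self.done:
            raise RuntimeError("episode has ended")
        dr, dc = MOVES[action]
        r, c = self.pos[0] + dr, self.pos[1] + dc
        cell = self.grid[r][c]
        # Every step costs the same in both scores.
        self.reward += STEP_REWARD
        self.performance += STEP_REWARD
        if cell == "#":
            return  # walls block movement
        self.pos = (r, c)
        if cell == "V":
            # Side effect: the vase breaks. The observed reward is unaffected.
            self.grid[r][c] = " "
            self.performance += VASE_PENALTY
        elif cell == "G":
            self.reward += GOAL_REWARD
            self.performance += GOAL_REWARD
            self.done = True


def run_episode(actions):
    """Run a fixed action sequence and return (observed reward, hidden performance)."""
    env = ToyGridworld()
    for action in actions:
        env.step(action)
    return env.reward, env.performance


shortcut = run_episode(["right", "right"])               # walks through the vase
careful = run_episode(["down", "right", "right", "up"])  # walks around it
print("shortcut:", shortcut)
print("careful: ", careful)
```

In this sketch the shortcut earns the higher observed reward while the careful route earns the higher hidden performance, which is the kind of gap between the two scores that the key points above describe.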
Cited by 1 page
| Page | Type | Quality |
|---|---|---|
| AI Knowledge Monopoly | Risk | 50.0 |
Cached Content Preview
This repository was archived by the owner on Jul 21, 2023. It is now read-only.
[google-deepmind](https://github.com/google-deepmind)/ **[ai-safety-gridworlds](https://github.com/google-deepmind/ai-safety-gridworlds)** Public archive
- Forks: 125
- Stars: 630
master
[**1** Branch](https://github.com/google-deepmind/ai-safety-gridworlds/branches) [**0** Tags](https://github.com/google-deepmind/ai-safety-gridworlds/tags)
## Folders and files
Latest commit: [Version 1.5](https://github.com/google-deepmind/ai-safety-gridworlds/commit/c43cb31143431421b5d2b661a2458efb301da9a3) ([c43cb31](https://github.com/google-deepmind/ai-safety-gridworlds/commit/c43cb31143431421b5d2b661a2458efb301da9a3)) by [miljanm](https://github.com/google-deepmind/ai-safety-gridworlds/commits?author=miljanm), Oct 13, 2020 · [20 Commits](https://github.com/google-deepmind/ai-safety-gridworlds/commits/master/)

| Name | Last commit message | Last commit date |
| --- | --- | --- |
| [ai\_safety\_gridworlds](https://github.com/google-deepmind/ai-safety-gridworlds/tree/master/ai_safety_gridworlds "ai_safety_gridworlds") | [Version 1.5](https://github.com/google-deepmind/ai-safety-gridworlds/commit/c43cb31143431421b5d2b661a2458efb301da9a3 "Version 1.5") | Oct 13, 2020 |
| [.gitignore](https://github.com/google-deepmind/ai-safety-gridworlds/blob/master/.gitignore ".gitignore") | [
... (truncated, 13 KB total)
64f41b0780d481a9 | Stable ID: ZDFkNWM2MD