[2205.10330] A Review of Safe Reinforcement Learning: Methods, Theory and Applications
Authors: Shangding Gu, Long Yang, Yali Du, Guang Chen, Florian Walter, Jun Wang, Alois Knoll
Credibility Rating
Good quality. Reputable source with community review or editorial standards, but less rigorous than peer-reviewed venues.
Rating inherited from publication venue: arXiv
A useful survey for researchers entering safe RL. It covers the gap between theoretical safety guarantees and practical deployment, and is relevant to AI safety work on ensuring RL agents behave safely in high-stakes domains such as robotics and autonomous vehicles.
Paper Details
Metadata
Abstract
Reinforcement Learning (RL) has achieved tremendous success in many complex decision-making tasks. However, safety concerns are raised during deploying RL in real-world applications, leading to a growing demand for safe RL algorithms, such as in autonomous driving and robotics scenarios. While safe control has a long history, the study of safe RL algorithms is still in the early stages. To establish a good foundation for future safe RL research, in this paper, we provide a review of safe RL from the perspectives of methods, theories, and applications. Firstly, we review the progress of safe RL from five dimensions and come up with five crucial problems for safe RL being deployed in real-world applications, coined as "2H3W". Secondly, we analyze the algorithm and theory progress from the perspectives of answering the "2H3W" problems. Particularly, the sample complexity of safe RL algorithms is reviewed and discussed, followed by an introduction to the applications and benchmarks of safe RL algorithms. Finally, we open the discussion of the challenging problems in safe RL, hoping to inspire future research on this thread. To advance the study of safe RL algorithms, we release an open-sourced repository containing the implementations of major safe RL algorithms at the link: https://github.com/chauncygu/Safe-Reinforcement-Learning-Baselines.git.
Summary
A comprehensive survey of safe reinforcement learning that organizes the field around five critical dimensions formalized as the '2H3W' framework (addressing How to define safety, How to ensure safety, and When/Where/Why safety matters). The paper reviews algorithmic progress, sample complexity theory, real-world applications in autonomous driving and robotics, and benchmarks, while releasing an open-source implementation repository of major safe RL algorithms.
Key Points
- Introduces the '2H3W' framework organizing five crucial problems for deploying safe RL in real-world settings, covering safety definitions, enforcement mechanisms, and application contexts.
- Reviews the sample complexity of safe RL algorithms, providing theoretical grounding for understanding how much data is needed to learn safe policies.
- Surveys constrained Markov decision process (CMDP) formulations, Lyapunov-based methods, shielding, and other approaches to enforcing safety constraints during learning.
- Covers real-world application domains including autonomous driving and robotics, with a discussion of existing benchmarks for evaluating safe RL algorithms.
- Releases an open-source repository implementing major safe RL algorithms, facilitating reproducible research and community benchmarking.
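The CMDP formulation noted in the key points is commonly handled with a Lagrangian (primal-dual) scheme: the cost constraint is folded into the objective via a multiplier that is raised whenever expected cost exceeds the budget. As a minimal illustrative sketch (a toy one-state, two-action problem constructed here for exposition, not code from the paper or its repository):

```python
import math

def softmax(prefs):
    """Convert action preferences (logits) into a probability distribution."""
    m = max(prefs)
    exps = [math.exp(p - m) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]

def primal_dual(reward, cost, budget, steps=6000, lr=0.05, lr_dual=0.02):
    """Primal-dual loop on a one-state, two-action CMDP:
    maximize E[reward] subject to E[cost] <= budget, via the Lagrangian
    L = E[reward] - lam * (E[cost] - budget).
    Returns the time-averaged policy and the final dual variable."""
    prefs = [0.0, 0.0]   # policy logits
    lam = 0.0            # dual variable (Lagrange multiplier), kept >= 0
    avg = [0.0, 0.0]     # averaged iterates approximate the saddle point
    for _ in range(steps):
        probs = softmax(prefs)
        # Exact policy gradient of the Lagrangian (advantage minus baseline).
        adv = [reward[a] - lam * cost[a] for a in range(2)]
        baseline = sum(p * g for p, g in zip(probs, adv))
        for a in range(2):
            prefs[a] += lr * probs[a] * (adv[a] - baseline)
        # Dual ascent: raise lam while the expected cost exceeds the budget.
        exp_cost = sum(p * c for p, c in zip(probs, cost))
        lam = max(0.0, lam + lr_dual * (exp_cost - budget))
        avg = [x + p / steps for x, p in zip(avg, probs)]
    return avg, lam

# Action 0 earns more reward but incurs cost 1; action 1 is cost-free.
# Unconstrained RL would always pick action 0; the budget forces a mix.
policy, lam = primal_dual(reward=[1.0, 0.5], cost=[1.0, 0.0], budget=0.2)
```

The averaged policy puts roughly `budget` probability on the costly action, which is the budget-feasible optimum here; the survey's Lagrangian-based methods apply the same idea with learned value and cost critics.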
Cited by 1 page
| Page | Type | Quality |
|---|---|---|
| Deep Learning Revolution Era | Historical | 44.0 |
Cached Content Preview
Computer Science > Artificial Intelligence
arXiv:2205.10330 (cs)
[Submitted on 20 May 2022 ( v1 ), last revised 24 May 2024 (this version, v5)]
Title: A Review of Safe Reinforcement Learning: Methods, Theory and Applications
Authors: Shangding Gu, Long Yang, Yali Du, Guang Chen, Florian Walter, Jun Wang, Alois Knoll
Subjects: Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
Cite as: arXiv:2205.10330 [cs.AI] (or arXiv:2205.10330v5 [cs.AI] for this version)
https://doi.org/10.48550/arXiv.2205.10330
arXiv-issued DOI via DataCite
Submission history
From: Shangding Gu
[v1] Fri, 20 May 2022 17:42:38 UTC (26,152 KB)
[v2] Mon, 23 May 2022 08:18:52 UTC (27,097 KB)
[v3] Sat, 4 Jun 2022 17:03:49 UTC (27,091 KB)
[v4] Mon, 20 Feb 2023 10:34:26 UTC (27,102 KB)
[v5] Fri, 24 May 2024 22:33:04 UTC (27,157 KB)
... (truncated, 6 KB total)