Paper Title

A Review of Safe Reinforcement Learning: Methods, Theory and Applications

Authors

Shangding Gu, Long Yang, Yali Du, Guang Chen, Florian Walter, Jun Wang, Alois Knoll

Abstract

Reinforcement Learning (RL) has achieved tremendous success in many complex decision-making tasks. However, safety concerns arise when RL is deployed in real-world applications such as autonomous driving and robotics, leading to a growing demand for safe RL algorithms. While safe control has a long history, the study of safe RL algorithms is still in its early stages. To establish a good foundation for future safe RL research, in this paper we provide a review of safe RL from the perspectives of methods, theory, and applications. First, we review the progress of safe RL along five dimensions and identify five crucial problems for deploying safe RL in real-world applications, which we term "2H3W". Second, we analyze progress in algorithms and theory from the perspective of answering the "2H3W" problems. In particular, the sample complexity of safe RL algorithms is reviewed and discussed, followed by an introduction to the applications and benchmarks of safe RL algorithms. Finally, we open a discussion of the challenging problems in safe RL, hoping to inspire future research in this direction. To advance the study of safe RL algorithms, we release an open-source repository containing implementations of major safe RL algorithms at: https://github.com/chauncygu/Safe-Reinforcement-Learning-Baselines.git.
