Safe reinforcement learning control for continuous-time nonlinear systems without a backup controller

Paper Title

Safe reinforcement learning control for continuous-time nonlinear systems without a backup controller

Authors

Soutrik Bandyopadhyay, Shubhendu Bhasin

Abstract

This paper proposes an on-policy reinforcement learning (RL) control algorithm that solves the optimal regulation problem for a class of uncertain continuous-time nonlinear systems under user-defined state constraints. We formulate the safe RL problem as the minimization of the Hamiltonian subject to a constraint on the time-derivative of a barrier Lyapunov function (BLF). We subsequently use the analytical solution of the optimization problem to modify the Actor-Critic-Identifier architecture to learn the optimal control policy safely. The proposed method does not require the presence of external backup controllers, and the RL policy ensures safety for the entire duration. The efficacy of the proposed controller is demonstrated on a class of Euler-Lagrange systems.
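The constrained minimization described in the abstract can be illustrated with a small closed-form example. The sketch below is not the paper's algorithm: it assumes single-input control-affine dynamics x' = f(x) + g(x)u, a Hamiltonian of the form H(u) = Q(x) + R u^2 + grad_V . (f + g u), and the simplest possible safety constraint grad_B . (f + g u) <= 0 on the BLF time-derivative. The names `safe_control`, `grad_V`, `grad_B`, and `R` are hypothetical, and the gradients are taken as given numbers rather than learned by an Actor-Critic-Identifier.

```python
def dot(a, b):
    """Inner product of two equal-length lists."""
    return sum(x * y for x, y in zip(a, b))


def safe_control(f, g, grad_V, grad_B, R):
    """Minimize H(u) = Q + R*u**2 + grad_V.(f + g*u)
    subject to the assumed constraint grad_B.(f + g*u) <= 0.

    f, g, grad_V, grad_B are lists evaluated at the current state;
    R > 0 is a scalar control weight. Returns (u, lam), where lam is
    the KKT multiplier of the BLF constraint.
    """
    gV = dot(g, grad_V)
    gB = dot(g, grad_B)

    # Unconstrained minimizer of the Hamiltonian: dH/du = 2*R*u + gV = 0.
    u0 = -0.5 * gV / R

    # BLF time-derivative if u0 were applied.
    bdot0 = dot(grad_B, f) + gB * u0

    # Curvature term that determines how strongly u can change B_dot.
    denom = gB * gB / R

    # Constraint already satisfied, or the input cannot influence B_dot
    # (in the latter case this sketch simply falls back to u0).
    if bdot0 <= 0.0 or denom == 0.0:
        return u0, 0.0

    # Active constraint: choose lam so that B_dot is exactly zero.
    lam = 2.0 * bdot0 / denom
    u = -0.5 * (gV + lam * gB) / R
    return u, lam


# Example: the unconstrained input would let the barrier function grow,
# so the multiplier becomes positive and the input is corrected.
f = [0.5, 1.0]
g = [0.0, 1.0]
u, lam = safe_control(f, g, grad_V=[0.0, 0.0], grad_B=[0.0, 2.0], R=1.0)
b_dot = dot([0.0, 2.0], [fi + gi * u for fi, gi in zip(f, g)])
```

When the constraint is inactive, the function returns the usual unconstrained optimal input and a zero multiplier; when it is active, the correction term pushes the input along the direction that holds the barrier function's derivative at zero. Because the correction comes from the same optimization, no separate backup controller appears in this sketch.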
