Paper title
Enforcing robust control guarantees within neural network policies
Paper authors
Paper abstract
When designing controllers for safety-critical systems, practitioners often face a challenging tradeoff between robustness and performance. While robust control methods provide rigorous guarantees on system stability under certain worst-case disturbances, they often yield simple controllers that perform poorly in the average (non-worst) case. In contrast, nonlinear control methods trained using deep learning have achieved state-of-the-art performance on many control tasks, but often lack robustness guarantees. In this paper, we propose a technique that combines the strengths of these two approaches: constructing a generic nonlinear control policy class, parameterized by neural networks, that nonetheless enforces the same provable robustness criteria as robust control. Specifically, our approach entails integrating custom convex-optimization-based projection layers into a neural network-based policy. We demonstrate the power of this approach on several domains, improving in average-case performance over existing robust control methods and in worst-case stability over (non-robust) deep RL methods.
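The core mechanism the abstract describes is composing a learned policy with a convex projection layer, so that every emitted action lies in a provably safe set. The paper's actual projection is a custom convex optimization derived from robust control conditions; the sketch below is a minimal, hypothetical stand-in that uses a closed-form Euclidean projection onto a norm ball (the names `policy_network`, `project_action`, and `u_max` are illustrative, not from the paper).

```python
import numpy as np

def policy_network(state, W):
    # Toy linear stand-in for a neural network policy: u = W @ state.
    return W @ state

def project_action(u, u_max):
    # Euclidean projection of the raw action onto the ball ||u||_2 <= u_max,
    # a simple convex set standing in for the paper's robustness constraint.
    norm = np.linalg.norm(u)
    if norm <= u_max:
        return u
    return u * (u_max / norm)

def safe_policy(state, W, u_max):
    # Compose the learned policy with the projection layer so the emitted
    # action satisfies the constraint by construction, regardless of what
    # the network outputs.
    return project_action(policy_network(state, W), u_max)

if __name__ == "__main__":
    W = np.eye(2)
    state = np.array([3.0, 4.0])          # raw action has norm 5.0
    u = safe_policy(state, W, u_max=1.0)  # projected back to norm 1.0
    print(np.linalg.norm(u))
```

Because the projection is itself a (sub)differentiable operation, the same composition can be trained end to end with standard deep RL, which is how the method retains learned performance while enforcing the constraint at every step.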