Paper Title


Technical Report: Adaptive Control for Linearizable Systems Using On-Policy Reinforcement Learning

Authors

Tyler Westenbroek, Eric Mazumdar, David Fridovich-Keil, Valmik Prabhu, Claire J. Tomlin, S. Shankar Sastry

Abstract


This paper proposes a framework for adaptively learning a feedback linearization-based tracking controller for an unknown system using discrete-time model-free policy-gradient parameter update rules. The primary advantage of the scheme over standard model-reference adaptive control techniques is that it does not require the learned inverse model to be invertible at all instances of time. This enables the use of general function approximators to approximate the linearizing controller for the system without having to worry about singularities. However, the discrete-time and stochastic nature of these algorithms precludes the direct application of standard machinery from the adaptive control literature to provide deterministic stability proofs for the system. Nevertheless, we leverage these techniques alongside tools from the stochastic approximation literature to demonstrate that with high probability the tracking and parameter errors concentrate near zero when a certain persistence of excitation condition is satisfied. A simulated example of a double pendulum demonstrates the utility of the proposed theory.
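To make the idea in the abstract concrete, the following is a minimal sketch (not the paper's actual algorithm or double-pendulum example) of model-free learning of a linearizing controller. It uses a toy scalar plant `x' = sin(x) + 2u`, a controller that is linear in hand-picked features `(1, sin(x), v)`, and an SPSA-style zeroth-order update that only observes rollout tracking cost — all of these modeling choices (`rollout`, `spsa_learn`, the gains, and the reference trajectory) are illustrative assumptions, not from the paper.

```python
import math
import random

def rollout(theta, T=100, dt=0.05):
    """Simulate the 'unknown' plant x' = sin(x) + 2u under the learned
    controller u = th0 + th1*sin(x) + th2*v, tracking xd(t) = sin(t).
    The learner only ever sees the returned scalar tracking cost."""
    x, cost = 0.5, 0.0
    for k in range(T):
        t = k * dt
        xd, xd_dot = math.sin(t), math.cos(t)       # reference trajectory
        v = xd_dot + 2.0 * (xd - x)                 # reference-model input
        u = theta[0] + theta[1] * math.sin(x) + theta[2] * v
        u = max(-5.0, min(5.0, u))                  # keep the rollout bounded
        x = x + dt * (math.sin(x) + 2.0 * u)        # plant step (unknown model)
        cost += (x - xd) ** 2
    return cost

def spsa_learn(iters=300, lr=1e-3, c=0.1, seed=0):
    """Model-free parameter update: perturb the controller parameters in a
    random direction, compare rollout costs, and step along the estimated
    descent direction (a simultaneous-perturbation gradient estimate)."""
    rng = random.Random(seed)
    theta = [0.0, 0.0, 0.0]
    best_theta, best_cost = list(theta), rollout(theta)
    for _ in range(iters):
        delta = [rng.choice([-1.0, 1.0]) for _ in theta]
        j_plus = rollout([t + c * d for t, d in zip(theta, delta)])
        j_minus = rollout([t - c * d for t, d in zip(theta, delta)])
        grad = (j_plus - j_minus) / (2.0 * c)       # directional derivative estimate
        step = max(-0.2, min(0.2, lr * grad))       # clip the step for stability
        theta = [max(-2.0, min(2.0, t - step * d))
                 for t, d in zip(theta, delta)]
        cur = rollout(theta)
        if cur < best_cost:                         # track the best parameters seen
            best_cost, best_theta = cur, list(theta)
    return best_theta, best_cost
```

Note how the update rule never inverts a learned model: it only needs rollout costs, which is the qualitative advantage over model-reference adaptive control that the abstract highlights. The exploration noise injected through the random perturbation direction also plays the role of the excitation the paper's persistence-of-excitation condition formalizes.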
