Paper Title

On the Analysis of Model-free Methods for the Linear Quadratic Regulator

Paper Authors

Zeyu Jin, Johann Michael Schmitt, Zaiwen Wen

Paper Abstract

Many reinforcement learning methods achieve great success in practice but lack a theoretical foundation. In this paper, we study the convergence analysis of model-free methods for the Linear Quadratic Regulator (LQR) problem. Global linear convergence properties and sample complexities are established for several popular algorithms, such as the policy gradient algorithm, TD-learning, and the actor-critic (AC) algorithm. Our results show that the actor-critic algorithm can reduce the sample complexity compared with the policy gradient algorithm. Although our analysis is still preliminary, it explains the benefit of the AC algorithm in a certain sense.
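
To make the setting concrete, below is a minimal sketch of a model-free (zeroth-order) policy gradient method for LQR, in the spirit of the algorithms the paper analyzes: the controller is a static gain u = -Kx, the cost of a gain is estimated from sampled trajectories, and the gradient is approximated by finite differences along random perturbations of K. This is not the paper's algorithm; the system matrices, horizon, and step sizes are illustrative assumptions.

```python
import numpy as np

# Hypothetical LQR instance: A, B, Q, R and all hyperparameters below are
# illustrative assumptions, not taken from the paper.
rng = np.random.default_rng(0)
n, m = 2, 1                                   # state and input dimensions
A = np.array([[0.9, 0.2],
              [0.0, 0.9]])                    # stable open-loop dynamics
B = np.array([[0.0],
              [1.0]])
Q, R = np.eye(n), np.eye(m)                   # quadratic cost weights

def rollout_cost(K, horizon=50, n_rollouts=10):
    """Monte Carlo estimate of the cost of the linear policy u = -K x."""
    total = 0.0
    for _ in range(n_rollouts):
        x = rng.normal(size=(n, 1))           # random initial state
        for _ in range(horizon):
            u = -K @ x
            total += float(x.T @ Q @ x + u.T @ R @ u)
            x = A @ x + B @ u
    return total / n_rollouts

# Zeroth-order (two-point) gradient estimate: perturb K on a sphere of
# radius r and use the cost difference along the perturbation direction.
K = np.zeros((m, n))                          # initial stabilizing gain
r, step = 0.05, 1e-4
for _ in range(200):
    U = rng.normal(size=K.shape)
    U *= r / np.linalg.norm(U)                # random direction, radius r
    g = (rollout_cost(K + U) - rollout_cost(K - U)) * (K.size / (2 * r**2)) * U
    K -= step * g                             # gradient descent on the gain
```

An actor-critic variant would replace the Monte Carlo cost estimate with a learned critic (e.g., fitted by TD-learning), which is the mechanism behind the reduced sample complexity discussed in the abstract.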
