Taylor Expansion Policy Optimization

Paper title

Taylor Expansion Policy Optimization

Authors

Yunhao Tang, Michal Valko, Rémi Munos

Abstract

In this work, we investigate the application of Taylor expansions in reinforcement learning. In particular, we propose Taylor expansion policy optimization, a policy optimization formalism that generalizes prior work (e.g., TRPO) as a first-order special case. We also show that Taylor expansions intimately relate to off-policy evaluation. Finally, we show that this new formulation entails modifications which improve the performance of several state-of-the-art distributed algorithms.
