Paper Title

Average-reward model-free reinforcement learning: a systematic review and literature mapping

Authors

Vektor Dewanto, George Dunn, Ali Eshragh, Marcus Gallagher, Fred Roosta

Abstract

Reinforcement learning is an important part of artificial intelligence. In this paper, we review model-free reinforcement learning that uses the average-reward optimality criterion in the infinite-horizon setting. Motivated by the solo survey by Mahadevan (1996a), we provide an updated review of work in this area and extend it to cover policy-iteration and function approximation methods (in addition to the value-iteration and tabular counterparts). We present a comprehensive literature mapping. We also identify and discuss opportunities for future work.
