Student/Teacher Advising through Reward Augmentation

Paper Title

Student/Teacher Advising through Reward Augmentation

Authors

Reid, Cameron

Abstract

Transfer learning is an important new subfield of multiagent reinforcement learning that aims to help an agent learn about a problem by using knowledge that it has gained solving another problem, or by using knowledge that is communicated to it by an agent who already knows the problem. This is useful when one wishes to change the architecture or learning algorithm of an agent (so that the new knowledge need not be built "from scratch"), when new agents are frequently introduced to the environment with no knowledge, or when an agent must adapt to similar but different problems. Great progress has been made in the agent-to-agent case using the Teacher/Student framework proposed by Torrey and Taylor (2013). However, that approach requires that learning from a teacher be treated differently from learning in every other reinforcement learning context. In this paper, I propose a method that allows the Teacher/Student framework to be applied in a way that fits directly and naturally into the more general reinforcement learning framework, by integrating the teacher's feedback into the reward signal received by the learning agent. I show that this approach can significantly improve the rate of learning for an agent playing a one-player stochastic game; I give examples of potential pitfalls of the approach; and I propose further areas of research building on this framework.
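
The idea of folding teacher feedback into the reward signal can be illustrated with a small sketch. This is not the paper's formulation, which the abstract does not specify. It assumes a fixed additive bonus whenever the student's action matches the teacher's advice, a tabular Q-learner, and a hypothetical noisy chain environment; the names `augmented_reward`, `q_update`, `train`, and the `bonus` parameter are all invented for illustration.

```python
import random


def augmented_reward(env_reward, action, advised_action, bonus=0.5):
    """Add a fixed bonus (an assumption) when the action matches the advice."""
    if advised_action is not None and action == advised_action:
        return env_reward + bonus
    return env_reward


def q_update(q, state, action, reward, next_state, actions, alpha=0.1, gamma=0.9):
    """Standard tabular Q-learning update; it never sees the teacher directly."""
    best_next = max(q.get((next_state, a), 0.0) for a in actions)
    old = q.get((state, action), 0.0)
    q[(state, action)] = old + alpha * (reward + gamma * best_next - old)
    return q[(state, action)]


def train(episodes=200, n_states=5, bonus=0.5, epsilon=0.1, seed=0):
    """Train on a hypothetical noisy chain; reaching the right end pays 1."""
    rng = random.Random(seed)
    actions = (-1, 1)

    def teacher(state):
        # Hypothetical teacher: always advises moving right.
        return 1

    q = {}
    for _ in range(episodes):
        state = 0
        for _ in range(100):  # cap on steps per episode
            # Epsilon-greedy action selection.
            if rng.random() < epsilon:
                action = rng.choice(actions)
            else:
                action = max(actions, key=lambda a: q.get((state, a), 0.0))
            # Stochastic transition: the move is reversed with probability 0.1.
            move = action if rng.random() > 0.1 else -action
            next_state = min(max(state + move, 0), n_states - 1)
            done = next_state == n_states - 1
            env_reward = 1.0 if done else 0.0
            # The teacher's feedback enters only through the reward.
            reward = augmented_reward(env_reward, action, teacher(state), bonus)
            q_update(q, state, action, reward, next_state, actions)
            state = next_state
            if done:
                break
    return q
```

Because the advice arrives only through `reward`, the learning rule in `q_update` is unchanged from ordinary Q-learning. One general caveat of additive bonuses like this one is that they can change which policy is optimal, so the size of `bonus` needs care.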
