Interaction-Grounded Learning with Action-inclusive Feedback

Paper Title

Interaction-Grounded Learning with Action-inclusive Feedback

Authors

Tengyang Xie, Akanksha Saran, Dylan J. Foster, Lekan Molu, Ida Momennejad, Nan Jiang, Paul Mineiro, John Langford

Abstract

Consider the problem setting of Interaction-Grounded Learning (IGL), in which a learner's goal is to optimally interact with the environment with no explicit reward to ground its policies. The agent observes a context vector, takes an action, and receives a feedback vector, using this information to effectively optimize a policy with respect to a latent reward function. Previously analyzed approaches fail when the feedback vector contains the action, which significantly limits IGL's success in many potential scenarios such as brain-computer interface (BCI) or human-computer interface (HCI) applications. We address this by creating an algorithm and analysis which allow IGL to work even when the feedback vector contains the action, encoded in any fashion. We provide theoretical guarantees and large-scale experiments based on supervised datasets to demonstrate the effectiveness of the new approach.
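
The interaction protocol described in the abstract (context, then action, then feedback, with the reward never shown) can be illustrated with a minimal simulation. This is only a sketch under stated assumptions, not the paper's algorithm: the function name `igl_interaction`, the integer contexts, the uniform-random behaviour policy, and the particular feedback encoding are all hypothetical choices made for illustration.

```python
import random


def igl_interaction(num_rounds, num_actions, seed=0):
    """Simulate the IGL protocol: context -> action -> feedback, no reward shown.

    Hypothetical toy environment, not taken from the paper.
    """
    rng = random.Random(seed)
    log = []
    for _ in range(num_rounds):
        # Context: here simply an integer naming the (hidden) best action.
        context = rng.randrange(num_actions)
        # Uniform-random behaviour policy (an assumption; any policy would do).
        action = rng.randrange(num_actions)
        # Latent reward: computed by the environment, never given to the learner.
        latent_reward = 1 if action == context else 0
        # Action-inclusive feedback: the vector carries the action itself plus a
        # bit whose meaning depends on the action (flipped for odd actions).
        feedback = (action, latent_reward ^ (action % 2))
        # The learner's data contains only (context, action, feedback).
        log.append((context, action, feedback))
    return log


if __name__ == "__main__":
    for context, action, feedback in igl_interaction(5, num_actions=3):
        print(context, action, feedback)
```

In this toy setup the second feedback component cannot be read as a reward without also using the action stored in the first component, which is a simple instance of feedback that "contains the action".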
