Goal-Aware Prediction: Learning to Model What Matters

Paper Title

Goal-Aware Prediction: Learning to Model What Matters

Authors

Suraj Nair, Silvio Savarese, Chelsea Finn

Abstract

Learned dynamics models combined with both planning and policy learning algorithms have shown promise in enabling artificial agents to learn to perform many diverse tasks with limited supervision. However, one of the fundamental challenges in using a learned forward dynamics model is the mismatch between the objective of the learned model (future state reconstruction), and that of the downstream planner or policy (completing a specified task). This issue is exacerbated by vision-based control tasks in diverse real-world environments, where the complexity of the real world dwarfs model capacity. In this paper, we propose to direct prediction towards task relevant information, enabling the model to be aware of the current task and encouraging it to only model relevant quantities of the state space, resulting in a learning objective that more closely matches the downstream task. Further, we do so in an entirely self-supervised manner, without the need for a reward function or image labels. We find that our method more effectively models the relevant parts of the scene conditioned on the goal, and as a result outperforms standard task-agnostic dynamics models and model-free reinforcement learning.
