Paper Title
Planning from Pixels using Inverse Dynamics Models
Authors
Abstract
Learning task-agnostic dynamics models in high-dimensional observation spaces can be challenging for model-based RL agents. We propose a novel way to learn latent world models by learning to predict sequences of future actions conditioned on task completion. These task-conditioned models adaptively focus modeling capacity on task-relevant dynamics, while simultaneously serving as an effective heuristic for planning with sparse rewards. We evaluate our method on challenging visual goal completion tasks and show a substantial increase in performance compared to prior model-free approaches.
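
The abstract does not specify an architecture, so the following is only a toy sketch of one possible reading of "predict sequences of future actions conditioned on task completion". Everything in it is an assumption made for illustration: the class name ToyInverseDynamics, the linear encoder, the use of a goal observation to stand in for task completion, and the simplification that each future action is predicted independently rather than autoregressively. It is not the authors' implementation.

```python
import math
import random


def softmax(logits):
    # Numerically stable softmax over a list of floats.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


class ToyInverseDynamics:
    """Hypothetical goal-conditioned action-sequence predictor (illustrative only)."""

    def __init__(self, obs_dim, latent_dim, n_actions, horizon, seed=0):
        rng = random.Random(seed)

        def init(rows, cols):
            return [[rng.gauss(0.0, 0.1) for _ in range(cols)] for _ in range(rows)]

        # Encoder weights: observation -> latent state.
        self.w_enc = init(latent_dim, obs_dim)
        # One linear action head per future step (assumed, non-autoregressive).
        self.w_act = [init(n_actions, 2 * latent_dim) for _ in range(horizon)]

    def encode(self, obs):
        # Map a raw observation vector to a latent vector.
        return [math.tanh(sum(w * x for w, x in zip(row, obs))) for row in self.w_enc]

    def action_probs(self, obs, goal_obs):
        # Condition on both the current latent and the goal latent.
        z = self.encode(obs) + self.encode(goal_obs)
        return [
            softmax([sum(w * v for w, v in zip(row, z)) for row in head])
            for head in self.w_act
        ]

    def nll(self, obs, goal_obs, actions):
        # Negative log-likelihood of the actions that led to the goal;
        # minimizing this would be the training objective in this sketch.
        probs = self.action_probs(obs, goal_obs)
        return -sum(math.log(p[a]) for p, a in zip(probs, actions))


model = ToyInverseDynamics(obs_dim=4, latent_dim=3, n_actions=2, horizon=3)
obs, goal = [0.1, 0.2, 0.3, 0.4], [0.4, 0.3, 0.2, 0.1]
loss = model.nll(obs, goal, actions=[0, 1, 1])
```

In a sketch like this, the likelihood the model assigns to a candidate action sequence could serve as a planning heuristic: sequences judged more likely to reach the goal would be explored first. This is one plausible way to read the abstract's remark about sparse-reward planning.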