Paper title
A General Purpose Supervisory Signal for Embodied Agents
Paper authors
Paper abstract
Training effective embodied AI agents often involves manual reward engineering, expert imitation, specialized components such as maps, or leveraging additional sensors for depth and localization. Another approach is to use neural architectures alongside self-supervised objectives which encourage better representation learning. In practice, there are few guarantees that these self-supervised objectives encode task-relevant information. We propose the Scene Graph Contrastive (SGC) loss, which uses scene graphs as general-purpose, training-only, supervisory signals. The SGC loss does away with explicit graph decoding and instead uses contrastive learning to align an agent's representation with a rich graphical encoding of its environment. The SGC loss is generally applicable, simple to implement, and encourages representations that encode objects' semantics, relationships, and history. Using the SGC loss, we attain significant gains on three embodied tasks: Object Navigation, Multi-Object Navigation, and Arm Point Navigation. Finally, we present studies and analyses which demonstrate the ability of our trained representation to encode semantic cues about the environment.
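The abstract describes aligning an agent's representation with a graph encoding through contrastive learning, but it does not give the loss formula. The sketch below illustrates one common form of such an objective, an InfoNCE-style loss, under stated assumptions: the function name `contrastive_alignment_loss`, the `temperature` parameter, and the use of plain vectors as stand-ins for the agent and scene-graph embeddings are all hypothetical and are not taken from the paper. The graph encoder itself is omitted.

```python
import math
import random


def normalize(v):
    """Scale a vector to unit length (a zero vector is returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / norm for x in v]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def contrastive_alignment_loss(agent_reprs, graph_reprs, temperature=0.1):
    """InfoNCE-style loss (an assumed form, not the paper's exact loss).

    agent_reprs[i] is treated as the positive match for graph_reprs[i];
    every other graph embedding in the batch acts as a negative.
    """
    agents = [normalize(v) for v in agent_reprs]
    graphs = [normalize(v) for v in graph_reprs]
    total = 0.0
    for i, a in enumerate(agents):
        # Cosine similarity of agent i to every graph embedding, scaled.
        logits = [dot(a, g) / temperature for g in graphs]
        # Numerically stable log-sum-exp over the batch.
        m = max(logits)
        log_norm = m + math.log(sum(math.exp(l - m) for l in logits))
        # Negative log-probability assigned to the matching pair.
        total += log_norm - logits[i]
    return total / len(agents)


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical stand-ins for scene-graph embeddings.
    graphs = [[rng.gauss(0, 1) for _ in range(16)] for _ in range(8)]
    # Agent representations that nearly match their graph embeddings.
    aligned = [[x + 0.01 * rng.gauss(0, 1) for x in g] for g in graphs]
    # Agent representations unrelated to the graph embeddings.
    unrelated = [[rng.gauss(0, 1) for _ in range(16)] for _ in range(8)]
    print("aligned loss:  ", contrastive_alignment_loss(aligned, graphs))
    print("unrelated loss:", contrastive_alignment_loss(unrelated, graphs))
```

In this sketch the loss is lower when each agent vector points in the same direction as its paired graph vector than when the two sets are unrelated, which is the sense in which a contrastive objective aligns the two representations. Because the graph embeddings appear only inside the loss, they would be needed only during training, consistent with the abstract's description of scene graphs as a training-only signal.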