Paper Title
How to Train PointGoal Navigation Agents on a (Sample and Compute) Budget
Paper Authors
Paper Abstract
PointGoal navigation has seen significant recent interest and progress, spurred on by the Habitat platform and associated challenge. In this paper, we study PointGoal navigation under both a sample budget (75 million frames) and a compute budget (1 GPU for 1 day). We conduct an extensive set of experiments, cumulatively totaling over 50,000 GPU-hours, that let us identify and discuss a number of ostensibly minor but significant design choices -- the advantage estimation procedure (a key component in training), the visual encoder architecture, and a seemingly minor hyper-parameter change. Overall, these design choices lead to considerable and consistent improvements over the baselines present in Savva et al. Under a sample budget, performance for RGB-D agents improves by 8 SPL on Gibson (14% relative improvement) and 20 SPL on Matterport3D (38% relative improvement). Under a compute budget, performance for RGB-D agents improves by 19 SPL on Gibson (32% relative improvement) and 35 SPL on Matterport3D (220% relative improvement). We hope our findings and recommendations will serve to make the community's experiments more efficient.
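The abstract singles out the advantage estimation procedure as a key component of training. In this setting that typically means Generalized Advantage Estimation (GAE; Schulman et al., 2016) inside a PPO-style learner. The following is a minimal illustrative sketch only, not the authors' implementation; the function name, tensor layout, and default gamma/lambda values are assumptions:

```python
import torch

def compute_gae(rewards, values, dones, gamma=0.99, gae_lambda=0.95):
    """Illustrative Generalized Advantage Estimation (Schulman et al., 2016).

    rewards, dones: float tensors of shape [T]
    values:         float tensor of shape [T + 1]; the last entry is the
                    bootstrap value estimate for the state after step T - 1.
    Returns a [T] tensor of advantage estimates.
    """
    T = rewards.shape[0]
    advantages = torch.zeros(T)
    gae = 0.0
    for t in reversed(range(T)):
        not_done = 1.0 - dones[t]  # zero out the bootstrap across episode ends
        # One-step TD error: delta_t = r_t + gamma * V(s_{t+1}) - V(s_t)
        delta = rewards[t] + gamma * values[t + 1] * not_done - values[t]
        # Recursive form: A_t = delta_t + gamma * lambda * A_{t+1}
        gae = delta + gamma * gae_lambda * not_done * gae
        advantages[t] = gae
    return advantages
```

Settings of exactly this kind (the discount gamma, the GAE lambda, and how advantages are estimated and normalized) are the "ostensibly minor but significant" choices the abstract refers to.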
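For readers unfamiliar with the reported metric, SPL (Success weighted by Path Length; Anderson et al., 2018) is standardly defined over $N$ evaluation episodes as

$$\mathrm{SPL} = \frac{1}{N} \sum_{i=1}^{N} S_i \, \frac{\ell_i}{\max(p_i, \ell_i)},$$

where $S_i \in \{0, 1\}$ indicates success on episode $i$, $\ell_i$ is the geodesic shortest-path distance from start to goal, and $p_i$ is the length of the path the agent actually took. The abstract reports SPL on a 0-100 scale, so "improves by 8 SPL" means 8 absolute points.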