Paper Title
How to Train PointGoal Navigation Agents on a (Sample and Compute) Budget
Paper Authors
Paper Abstract
PointGoal navigation has seen significant recent interest and progress, spurred on by the Habitat platform and associated challenge. In this paper, we study PointGoal navigation under both a sample budget (75 million frames) and a compute budget (1 GPU for 1 day). We conduct an extensive set of experiments, cumulatively totaling over 50,000 GPU-hours, that let us identify and discuss a number of ostensibly minor but significant design choices -- the advantage estimation procedure (a key component in training), the visual encoder architecture, and a seemingly minor hyper-parameter change. Overall, these design choices lead to considerable and consistent improvements over the baselines present in Savva et al. Under a sample budget, performance for RGB-D agents improves by 8 SPL on Gibson (14% relative improvement) and 20 SPL on Matterport3D (38% relative improvement). Under a compute budget, performance for RGB-D agents improves by 19 SPL on Gibson (32% relative improvement) and 35 SPL on Matterport3D (220% relative improvement). We hope our findings and recommendations will serve to make the community's experiments more efficient.
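The abstract singles out the advantage estimation procedure as a key component of training. In this setting that typically means Generalized Advantage Estimation (GAE; Schulman et al., 2016) inside a PPO-style learner. The following is a minimal illustrative sketch only, not the authors' implementation; the function name, tensor layout, and default gamma/lambda values are assumptions:

```python
import torch

def compute_gae(rewards, values, dones, gamma=0.99, gae_lambda=0.95):
    """Illustrative Generalized Advantage Estimation (Schulman et al., 2016).

    rewards, dones: float tensors of shape [T]
    values:         float tensor of shape [T + 1]; the last entry is the
                    bootstrap value estimate for the state after step T - 1.
    Returns a [T] tensor of advantage estimates.
    """
    T = rewards.shape[0]
    advantages = torch.zeros(T)
    gae = 0.0
    for t in reversed(range(T)):
        not_done = 1.0 - dones[t]  # zero out the bootstrap across episode ends
        # One-step TD error: delta_t = r_t + gamma * V(s_{t+1}) - V(s_t)
        delta = rewards[t] + gamma * values[t + 1] * not_done - values[t]
        # Recursive form: A_t = delta_t + gamma * lambda * A_{t+1}
        gae = delta + gamma * gae_lambda * not_done * gae
        advantages[t] = gae
    return advantages
```

Settings of exactly this kind (the discount gamma, the GAE lambda, and how advantages are estimated and normalized) are the "ostensibly minor but significant" choices the abstract refers to.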
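For readers unfamiliar with the reported metric, SPL (Success weighted by Path Length; Anderson et al., 2018) is standardly defined over $N$ evaluation episodes as

$$\mathrm{SPL} = \frac{1}{N} \sum_{i=1}^{N} S_i \, \frac{\ell_i}{\max(p_i, \ell_i)},$$

where $S_i \in \{0, 1\}$ indicates success on episode $i$, $\ell_i$ is the geodesic shortest-path distance from start to goal, and $p_i$ is the length of the path the agent actually took. The abstract reports SPL on a 0-100 scale, so "improves by 8 SPL" means 8 absolute points.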