SCONE: Surface Coverage Optimization in Unknown Environments by Volumetric Integration

Paper title

SCONE: Surface Coverage Optimization in Unknown Environments by Volumetric Integration

Authors

Antoine Guédon, Pascal Monasse, Vincent Lepetit

Abstract

Next Best View (NBV) computation is a long-standing problem in robotics. It consists of identifying the next most informative sensor position(s) for reconstructing a 3D object or scene efficiently and accurately. Like most current methods, we consider NBV prediction from a depth sensor such as a LiDAR system. Learning-based methods that rely on a volumetric representation of the scene are suitable for path planning, but have lower accuracy than methods using a surface-based representation. However, the latter do not scale well with the size of the scene and constrain the camera to a small number of poses. To obtain the advantages of both representations, we show that we can maximize surface metrics by Monte Carlo integration over a volumetric representation. In particular, we propose an approach, SCONE, that relies on two neural modules: the first module predicts the occupancy probability in the entire volume of the scene. Given any new camera pose, the second module samples points in the scene based on their occupancy probability and leverages a self-attention mechanism to predict the visibility of the samples. Finally, we integrate the visibility to evaluate the gain in surface coverage for the new camera pose. The NBV is selected as the pose that maximizes the gain in total surface coverage. Our method scales to large scenes and handles free camera motion: it takes as input an arbitrarily large point cloud gathered by a depth sensor, together with the camera poses, to predict the NBV. We demonstrate our approach on a novel dataset made of large and complex 3D scenes.
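To make the idea of the abstract concrete, the following is a minimal Python sketch of estimating a coverage gain by Monte Carlo integration over a volume and picking the candidate pose with the largest estimate. It is not the authors' implementation: `occupancy_prob` and `visibility_gain` are hypothetical analytic stand-ins for the two neural modules, the scene is a toy sphere inside the unit cube, samples are drawn uniformly rather than according to occupancy, and all names and parameters are assumptions made for illustration.

```python
import math
import random

CENTER = (0.5, 0.5, 0.5)  # assumed toy scene: a sphere inside the unit cube

def occupancy_prob(p, radius=0.3, sharpness=200.0):
    """Hypothetical stand-in for the first module: probability is high near the sphere surface."""
    d = math.dist(p, CENTER)
    return math.exp(-sharpness * (d - radius) ** 2)

def faces(p, cam):
    """Toy visibility test: p lies on the hemisphere that faces the camera."""
    return sum((pi - ci) * (ki - ci) for pi, ci, ki in zip(p, CENTER, cam)) > 0.0

def visibility_gain(p, cam, previous_cams):
    """Hypothetical stand-in for the second module: 1 if p is newly visible from cam, else 0."""
    if any(faces(p, prev) for prev in previous_cams):
        return 0.0  # already covered by an earlier view
    return 1.0 if faces(p, cam) else 0.0

def coverage_gain(cam, previous_cams, samples, volume=1.0):
    """Monte Carlo estimate of the integral of occupancy * visibility gain over the volume."""
    total = sum(occupancy_prob(p) * visibility_gain(p, cam, previous_cams) for p in samples)
    return volume * total / len(samples)

def next_best_view(candidates, previous_cams, n_samples=5000, seed=0):
    """Return the candidate pose with the largest estimated coverage gain, plus all gains."""
    rng = random.Random(seed)
    # Uniform samples in the unit cube, shared by all candidates for a fair comparison.
    samples = [(rng.random(), rng.random(), rng.random()) for _ in range(n_samples)]
    gains = [coverage_gain(c, previous_cams, samples) for c in candidates]
    best = max(range(len(candidates)), key=gains.__getitem__)
    return candidates[best], gains

previous = [(2.0, 0.5, 0.5)]  # one view already taken from the +x side
candidates = [(2.0, 0.5, 0.5), (0.5, 2.0, 0.5), (-1.0, 0.5, 0.5)]
best, gains = next_best_view(candidates, previous)
```

In this toy setup, a candidate that repeats the previous view gains nothing, a view from the side uncovers part of the unseen surface, and the view from the opposite side uncovers the most, so it is selected. Reusing the same samples for every candidate keeps the comparison between poses from being dominated by sampling noise.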
