Paper Title

Object-level 3D Semantic Mapping using a Network of Smart Edge Sensors

Authors

Julian Hau, Simon Bultmann, Sven Behnke

Abstract

Autonomous robots that interact with their environment require a detailed semantic scene model. For this, volumetric semantic maps are frequently used. Scene understanding can be further improved by including object-level information in the map. In this work, we extend a multi-view 3D semantic mapping system, consisting of a network of distributed smart edge sensors, with object-level information to enable downstream tasks that need object-level input. Objects are represented in the map via their 3D mesh model, or as an object-centric volumetric sub-map that can model arbitrary object geometry when no detailed 3D model is available. We propose a keypoint-based approach to estimate object poses via PnP, refined via ICP alignment of the 3D object model with the observed point cloud segments. Object instances are tracked to integrate observations over time and to remain robust against temporary occlusions. Our method is evaluated on the public BEHAVE dataset, where it shows pose estimation accuracy within a few centimeters, and in real-world experiments with the sensor network in a challenging lab environment, where multiple chairs and a table are tracked through the scene online and in real time, even under high occlusions.
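
The abstract compresses the pose pipeline into one sentence, so a minimal sketch may help make the two stages concrete. The snippet below is an illustrative reconstruction under stated assumptions, not the authors' implementation: it stands in OpenCV's solvePnPRansac for the keypoint-based PnP stage and Open3D's point-to-plane ICP for the refinement stage, and all function names, data layouts, and tuning values (e.g., the 5 cm correspondence gate) are assumptions.

```python
# Hypothetical sketch of the two-stage pose estimation named in the abstract:
# PnP on 2D-3D keypoint correspondences for an initial pose, then ICP
# refinement of the 3D object model against the observed point cloud segment.
# OpenCV and Open3D are assumed stand-ins, not the authors' code.
import cv2
import numpy as np
import open3d as o3d


def estimate_object_pose(model_kpts_3d, image_kpts_2d, K,
                         model_cloud, observed_segment):
    """Return a 4x4 object pose in the camera frame, or None on failure.

    model_kpts_3d    -- (N, 3) keypoints defined on the 3D object model
    image_kpts_2d    -- (N, 2) matching keypoints detected in the image
    K                -- (3, 3) camera intrinsic matrix
    model_cloud      -- o3d.geometry.PointCloud sampled from the object model
    observed_segment -- o3d.geometry.PointCloud of the segmented observation
    """
    # Stage 1: PnP with RANSAC, robust to keypoint detection outliers.
    ok, rvec, tvec, _ = cv2.solvePnPRansac(
        model_kpts_3d.astype(np.float64),
        image_kpts_2d.astype(np.float64),
        K.astype(np.float64), distCoeffs=None)
    if not ok:
        return None
    R, _ = cv2.Rodrigues(rvec)          # rotation vector -> 3x3 matrix
    T_init = np.eye(4)
    T_init[:3, :3] = R
    T_init[:3, 3] = tvec.ravel()

    # Stage 2: point-to-plane ICP from the PnP initialization, aligning the
    # model cloud to the observed segment. The 5 cm correspondence distance
    # is an assumed tuning value, not taken from the paper.
    observed_segment.estimate_normals()  # point-to-plane needs target normals
    icp = o3d.pipelines.registration.registration_icp(
        model_cloud, observed_segment,
        max_correspondence_distance=0.05,
        init=T_init,
        estimation_method=o3d.pipelines.registration
        .TransformationEstimationPointToPlane())
    return icp.transformation
```

In this reading, PnP supplies a coarse global pose from sparse keypoints, while ICP exploits the dense depth observations to pull the estimate to centimeter-level accuracy; tracking the resulting per-instance poses over time (not shown) is what provides robustness to temporary occlusions.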
