Paper Title

Scatter Points in Space: 3D Detection from Multi-view Monocular Images

Authors

Jianlin Liu, Zhuofei Huang, Dihe Huang, Shang Xu, Ying Chen, Yong Liu

Abstract

3D object detection from monocular image(s) is a challenging and long-standing problem in computer vision. To combine information from different viewpoints without cumbersome 2D instance tracking, recent methods tend to aggregate multi-view features by densely sampling a regular 3D grid in space, which is inefficient. In this paper, we attempt to improve multi-view feature aggregation by proposing a learnable keypoint sampling method that scatters pseudo surface points in 3D space in order to preserve data sparsity. The scattered points, augmented with multi-view geometric constraints and visual features, are then used to infer object locations and shapes in the scene. To compensate for the limitations of a single frame and to model multi-view geometry explicitly, we further propose a surface filter module for noise suppression. Experimental results show that our method achieves significantly better 3D detection performance than previous works (more than 0.1 AP improvement on some categories of ScanNet). The code will be publicly available.
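
The general idea of aggregating multi-view features at a sparse set of 3D points, rather than at every cell of a dense grid, can be illustrated with a minimal sketch. This is not the authors' implementation: the learnable keypoint sampling and the surface filter module are not reproduced here, and the function names (`project`, `aggregate_features`), the view dictionary layout, the skew-free pinhole camera model, the nearest-pixel lookup, and the plain averaging are all assumptions made for illustration.

```python
from typing import Dict, List, Optional, Sequence, Tuple

Point3 = Tuple[float, float, float]


def project(point: Point3, K: Sequence[Sequence[float]],
            R: Sequence[Sequence[float]],
            t: Sequence[float]) -> Optional[Tuple[float, float, float]]:
    """Project a world-space point with an assumed skew-free pinhole camera.

    Camera coordinates are computed as R @ point + t. Returns (u, v, depth),
    or None when the point lies behind (or on) the camera plane.
    """
    cam = [sum(R[i][j] * point[j] for j in range(3)) + t[i] for i in range(3)]
    depth = cam[2]
    if depth <= 1e-6:
        return None
    u = K[0][0] * cam[0] / depth + K[0][2]
    v = K[1][1] * cam[1] / depth + K[1][2]
    return u, v, depth


def aggregate_features(points: Sequence[Point3],
                       views: Sequence[Dict]) -> List[Tuple[Optional[List[float]], int]]:
    """Average nearest-pixel features over the views in which each point is visible.

    Each view is a dict with keys "K", "R", "t" and "features" (an H x W x C
    nested list); this layout is hypothetical. Returns one
    (mean_feature_or_None, number_of_views_hit) pair per input point.
    """
    results = []
    for p in points:
        acc: Optional[List[float]] = None
        hits = 0
        for view in views:
            proj = project(p, view["K"], view["R"], view["t"])
            if proj is None:
                continue
            u, v, _ = proj
            col, row = int(round(u)), int(round(v))
            fmap = view["features"]
            # Skip views where the projection falls outside the feature map.
            if 0 <= row < len(fmap) and 0 <= col < len(fmap[0]):
                feat = fmap[row][col]
                acc = list(feat) if acc is None else [a + b for a, b in zip(acc, feat)]
                hits += 1
        mean = [a / hits for a in acc] if acc is not None else None
        results.append((mean, hits))
    return results
```

With N sparse points and V views, this costs N × V projections, whereas a dense grid pays the same per-cell cost for every voxel in the volume, including empty space.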
