ConDor: Self-Supervised Canonicalization of 3D Pose for Partial Shapes

Paper title

ConDor: Self-Supervised Canonicalization of 3D Pose for Partial Shapes

Authors

Rahul Sajnani, Adrien Poulenard, Jivitesh Jain, Radhika Dua, Leonidas J. Guibas, Srinath Sridhar

Abstract

Progress in 3D object understanding has relied on manually canonicalized shape datasets that contain instances with consistent position and orientation (3D pose). This has made it hard to generalize these methods to in-the-wild shapes, e.g., from internet model collections or depth sensors. ConDor is a self-supervised method that learns to Canonicalize the 3D orientation and position for full and partial 3D point clouds. We build on top of Tensor Field Networks (TFNs), a class of permutation- and rotation-equivariant, and translation-invariant 3D networks. During inference, our method takes an unseen full or partial 3D point cloud at an arbitrary pose and outputs an equivariant canonical pose. During training, this network uses self-supervision losses to learn the canonical pose from an un-canonicalized collection of full and partial 3D point clouds. ConDor can also learn to consistently co-segment object parts without any supervision. Extensive quantitative results on four new metrics show that our approach outperforms existing methods while enabling new applications such as operation on depth images and annotation transfer.
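
To make the terms "canonical pose", "rotation-equivariant", and "translation-invariant" concrete, the sketch below canonicalizes a point cloud with classical PCA. This is not ConDor's learned, TFN-based method; it is a simple hand-crafted stand-in, and the function names (`pca_canonicalize`, `random_rotation`) and the synthetic test shape are assumptions made for illustration. It shows the property the abstract describes: the estimated frame rotates with the input, so the canonicalized points come out the same regardless of the input pose.

```python
import numpy as np


def pca_canonicalize(points):
    """Return (canonical_points, frame) for an (N, 3) point cloud.

    Illustrative PCA baseline, not the ConDor network.
    """
    # Subtracting the centroid removes any dependence on translation.
    centered = points - points.mean(axis=0)
    cov = centered.T @ centered / len(points)
    # eigh returns eigenvalues in ascending order; reverse the columns
    # so the first axis carries the largest variance.
    _, eigvecs = np.linalg.eigh(cov)
    frame = eigvecs[:, ::-1].copy()
    # Eigenvectors are only defined up to sign. Fix the first two axes
    # so the projected points have a non-negative third moment.
    for i in range(2):
        if np.sum((centered @ frame[:, i]) ** 3) < 0:
            frame[:, i] *= -1
    # Choose the third axis so the frame is a proper rotation (det = +1).
    frame[:, 2] = np.cross(frame[:, 0], frame[:, 1])
    return centered @ frame, frame


def random_rotation(rng):
    """Sample a 3x3 rotation matrix via QR decomposition."""
    q, _ = np.linalg.qr(rng.normal(size=(3, 3)))
    if np.linalg.det(q) < 0:
        q[:, 0] *= -1  # turn a reflection into a rotation
    return q


rng = np.random.default_rng(0)
# Synthetic asymmetric shape: distinct variances and clear skew per axis.
shape = rng.exponential(size=(500, 3)) * np.array([3.0, 2.0, 1.0])
rot = random_rotation(rng)
moved = shape @ rot.T + np.array([5.0, -2.0, 0.5])

canon_a, frame_a = pca_canonicalize(shape)
canon_b, frame_b = pca_canonicalize(moved)

# Pose-invariant output: both inputs map to the same canonical points.
print(np.allclose(canon_a, canon_b))
# Equivariant frame: the frame of the moved shape is the rotated frame.
print(np.allclose(frame_b, rot @ frame_a))
```

A hand-crafted frame like this breaks down for shapes with near-equal variances or symmetric projections, and it is not consistent across different instances of a category or under partiality, which is the setting the abstract says ConDor targets with self-supervised learning.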
