Paper Title


Self-Supervised Geometric Correspondence for Category-Level 6D Object Pose Estimation in the Wild

Paper Authors

Kaifeng Zhang, Yang Fu, Shubhankar Borse, Hong Cai, Fatih Porikli, Xiaolong Wang

Paper Abstract


While 6D object pose estimation has wide applications across computer vision and robotics, it remains far from being solved due to the lack of annotations. The problem becomes even more challenging when moving to category-level 6D pose, which requires generalization to unseen instances. Current approaches are restricted by leveraging annotations from simulation or collected from humans. In this paper, we overcome this barrier by introducing a self-supervised learning approach trained directly on large-scale real-world object videos for category-level 6D pose estimation in the wild. Our framework reconstructs the canonical 3D shape of an object category and learns dense correspondences between input images and the canonical shape via surface embedding. For training, we propose novel geometrical cycle-consistency losses which construct cycles across 2D-3D spaces, across different instances and different time steps. The learned correspondence can be applied for 6D pose estimation and other downstream tasks such as keypoint transfer. Surprisingly, our method, without any human annotations or simulators, can achieve on-par or even better performance than previous supervised or semi-supervised methods on in-the-wild images. Our project page is: https://kywind.github.io/self-pose.
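
The central technical idea in the abstract is a geometric cycle-consistency loss built between the 2D image and the canonical 3D shape: pixels are softly matched to the canonical surface via surface embeddings, matched back to the image, and the cycle should return to the starting pixel. Below is a minimal illustrative sketch of one such 2D-3D-2D cycle in PyTorch, under assumed inputs (hypothetical names pixel_feats, pixel_coords, vertex_feats and a softmax-based soft assignment); it is not the authors' implementation, which additionally builds cycles across different instances and different time steps.

# Minimal sketch of a 2D-3D-2D cycle-consistency loss (illustrative only).
import torch
import torch.nn.functional as F

def cycle_consistency_loss(pixel_feats, pixel_coords, vertex_feats, temperature=0.07):
    # pixel_feats:  (P, D) features of P sampled pixels from one frame
    # pixel_coords: (P, 2) their 2D image coordinates
    # vertex_feats: (V, D) surface embeddings of V canonical-shape vertices
    # Each pixel is softly assigned to the canonical surface (2D -> 3D) and
    # back to the pixels (3D -> 2D); the composed cycle should land where it started.
    pixel_feats = F.normalize(pixel_feats, dim=-1)
    vertex_feats = F.normalize(vertex_feats, dim=-1)

    sim = pixel_feats @ vertex_feats.T / temperature   # (P, V) similarity
    p2v = sim.softmax(dim=1)                           # soft 2D -> 3D assignment, (P, V)
    v2p = sim.softmax(dim=0).T                         # soft 3D -> 2D assignment, (V, P)

    cycle = p2v @ v2p                                  # (P, P) pixel -> surface -> pixel
    returned_coords = cycle @ pixel_coords             # expected landing coordinates
    return (returned_coords - pixel_coords).norm(dim=-1).mean()

# Example usage with random tensors:
if __name__ == "__main__":
    P, V, D = 128, 2048, 64
    loss = cycle_consistency_loss(torch.randn(P, D), torch.rand(P, 2) * 256, torch.randn(V, D))
    print(float(loss))

Because both assignments are soft (softmax-weighted), the cycle stays differentiable, so a loss of this form can supervise the surface embeddings without pose or correspondence labels.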
