RNNPose: Recurrent 6-DoF Object Pose Refinement with Robust Correspondence Field Estimation and Pose Optimization

Paper title

RNNPose: Recurrent 6-DoF Object Pose Refinement with Robust Correspondence Field Estimation and Pose Optimization

Authors

Xu, Yan; Lin, Kwan-Yee; Zhang, Guofeng; Wang, Xiaogang; Li, Hongsheng

Abstract

6-DoF object pose estimation from a monocular image is challenging, and a post-refinement procedure is generally needed for high-precision estimation. In this paper, we propose a framework based on a recurrent neural network (RNN) for object pose refinement, which is robust to erroneous initial poses and occlusions. During the recurrent iterations, object pose refinement is formulated as a non-linear least squares problem based on the estimated correspondence field (between a rendered image and the observed image). The problem is then solved by a differentiable Levenberg-Marquardt (LM) algorithm, enabling end-to-end training. The correspondence field estimation and pose refinement are conducted alternately in each iteration to recover the object poses. Furthermore, to improve the robustness to occlusion, we introduce a consistency-check mechanism based on the learned descriptors of the 3D model and observed 2D images, which downweights the unreliable correspondences during pose optimization. Extensive experiments on LINEMOD, Occlusion-LINEMOD, and YCB-Video datasets validate the effectiveness of our method and demonstrate state-of-the-art performance.
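The optimization step described in the abstract (a non-linear least squares problem over correspondences, solved with Levenberg-Marquardt, with unreliable correspondences down-weighted) can be illustrated on a much smaller problem. The sketch below is not the paper's differentiable 6-DoF solver: it assumes a 2D rigid transform (rotation angle plus translation), fixed per-correspondence weights supplied by the caller, and a simple accept/reject damping schedule. All function names (`solve3`, `transform`, `cost`, `weighted_lm`) are hypothetical and chosen only for this illustration.

```python
import math

def solve3(A, b):
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting."""
    n = 3
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in reversed(range(n)):
        s = M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))
        x[r] = s / M[r][r]
    return x

def transform(params, p):
    """Apply a 2D rigid transform (theta, tx, ty) to point p."""
    theta, tx, ty = params
    c, s = math.cos(theta), math.sin(theta)
    return (c * p[0] - s * p[1] + tx, s * p[0] + c * p[1] + ty)

def cost(params, src, dst, w):
    """Weighted sum of squared correspondence residuals."""
    total = 0.0
    for p, q, wi in zip(src, dst, w):
        x, y = transform(params, p)
        total += wi * ((x - q[0]) ** 2 + (y - q[1]) ** 2)
    return total

def weighted_lm(src, dst, w, params=(0.0, 0.0, 0.0), iters=50, lam=1e-3):
    """Toy weighted Levenberg-Marquardt for a 2D rigid transform (assumed setup)."""
    params = list(params)
    for _ in range(iters):
        c, s = math.cos(params[0]), math.sin(params[0])
        H = [[0.0] * 3 for _ in range(3)]  # accumulates J^T W J
        g = [0.0] * 3                      # accumulates J^T W r
        for p, q, wi in zip(src, dst, w):
            x, y = transform(params, p)
            r = (x - q[0], y - q[1])
            # Jacobian of the 2D residual with respect to (theta, tx, ty)
            J = ((-s * p[0] - c * p[1], 1.0, 0.0),
                 (c * p[0] - s * p[1], 0.0, 1.0))
            for a in range(3):
                g[a] += wi * (J[0][a] * r[0] + J[1][a] * r[1])
                for b in range(3):
                    H[a][b] += wi * (J[0][a] * J[0][b] + J[1][a] * J[1][b])
        # Damped normal equations: (H + lam * diag(H)) delta = -g
        A = [[H[a][b] + (lam * H[a][a] if a == b else 0.0) for b in range(3)]
             for a in range(3)]
        delta = solve3(A, [-v for v in g])
        trial = [params[i] + delta[i] for i in range(3)]
        if cost(trial, src, dst, w) < cost(params, src, dst, w):
            params, lam = trial, lam * 0.1  # accept the step, relax damping
        else:
            lam *= 10.0                     # reject the step, increase damping
    return params
```

Calling `weighted_lm(src, dst, w)` with lists of 2D point pairs and a weight per pair returns the estimated `[theta, tx, ty]`. Setting a weight to zero removes that correspondence from the fit entirely, which is the limiting case of the down-weighting the abstract describes; in the paper the weights come from a learned consistency check rather than being supplied by hand.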
