Human Detection and Segmentation via Multi-view Consensus

Paper Title

Human Detection and Segmentation via Multi-view Consensus

Authors

Isinsu Katircioglu, Helge Rhodin, Jörg Spörri, Mathieu Salzmann, Pascal Fua

Abstract

Self-supervised detection and segmentation of foreground objects aims for accuracy without annotated training data. However, existing approaches predominantly rely on restrictive assumptions on appearance and motion. For scenes with dynamic activities and camera motion, we propose a multi-camera framework in which geometric constraints are embedded in the form of multi-view consistency during training via coarse 3D localization in a voxel grid and fine-grained offset regression. In this manner, we learn a joint distribution of proposals over multiple views. At inference time, our method operates on single RGB images. We outperform state-of-the-art techniques both on images that visually depart from those of standard benchmarks and on those of the classical Human3.6M dataset.
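
The abstract does not give implementation details, so the following is only a minimal sketch of the general idea of coarse 3D localization in a voxel grid, a fine-grained offset, and projection of the resulting 3D point into several camera views. The function names (`soft_argmax_3d`, `project`, `localize_in_views`), the softmax-weighted averaging of voxel centers, and the pinhole camera matrices are all assumptions made for illustration, not the authors' code.

```python
import math

def soft_argmax_3d(scores, centers):
    """Coarse localization: softmax-weighted mean of voxel centers.

    Assumption: one scalar score per voxel; the paper may aggregate differently.
    """
    m = max(scores)  # subtract the max for numerical stability
    weights = [math.exp(s - m) for s in scores]
    total = sum(weights)
    return tuple(
        sum(w * c[i] for w, c in zip(weights, centers)) / total
        for i in range(3)
    )

def project(P, X):
    """Project a 3D point X with a 3x4 pinhole camera matrix P to pixels."""
    xh = (X[0], X[1], X[2], 1.0)  # homogeneous coordinates
    u, v, w = (sum(p * x for p, x in zip(row, xh)) for row in P)
    return (u / w, v / w)

def localize_in_views(scores, centers, offset, cameras):
    """Refine the coarse voxel estimate by an offset, then project per view."""
    coarse = soft_argmax_3d(scores, centers)
    refined = tuple(c + o for c, o in zip(coarse, offset))
    return [project(P, refined) for P in cameras]

# Hypothetical toy setup: two voxels, two cameras with focal length 100
# and principal point (50, 50); the second camera is shifted along x.
centers = [(0.0, 0.0, 4.0), (2.0, 0.0, 4.0)]
scores = [0.0, 0.0]          # equal scores: coarse estimate is the midpoint
offset = (0.0, 0.5, 0.0)     # hypothetical regressed offset
cam_a = [[100, 0, 50, 0], [0, 100, 50, 0], [0, 0, 1, 0]]
cam_b = [[100, 0, 50, -100], [0, 100, 50, 0], [0, 0, 1, 0]]

pixels = localize_in_views(scores, centers, offset, [cam_a, cam_b])
print(pixels)
```

Because every view is derived from the same 3D point, the per-view locations agree by construction, which is one simple way to picture the multi-view consistency the abstract refers to.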
