Paper Title

Exploit Clues from Views: Self-Supervised and Regularized Learning for Multiview Object Recognition

Paper Authors

Ho, Chih-Hui, Liu, Bo, Wu, Tz-Ying, Vasconcelos, Nuno

Paper Abstract

Multiview recognition has been well studied in the literature and achieves decent performance in object recognition and retrieval tasks. However, most previous works rely on supervised learning and on some impractical underlying assumptions, such as the availability of all views at training and inference time. In this work, the problem of multiview self-supervised learning (MV-SSL) is investigated, where only the image-to-object association is given. Given this setup, a novel surrogate task for self-supervised learning is proposed by pursuing an "object invariant" representation. This is solved by randomly selecting an image feature of an object as the object prototype, accompanied by multiview consistency regularization, which results in view-invariant stochastic prototype embedding (VISPE). Experiments show that the recognition and retrieval results using VISPE outperform those of other self-supervised learning methods on seen and unseen data. VISPE can also be applied to semi-supervised scenarios and demonstrates robust performance with limited data. Code is available at https://github.com/chihhuiho/VISPE
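The abstract describes the core training idea at a high level: each object's prototype is drawn at random from the features of its views, every view is encouraged to match its own object's prototype, and a multiview consistency term keeps the views of one object in agreement. The PyTorch-style sketch below illustrates one way such an objective could look; it is not the authors' implementation. The tensor layout, the temperature `tau`, the weight `lambda_reg`, and the specific consistency term (a KL divergence between each view's prototype distribution and the object's mean distribution) are all assumptions for illustration.

```python
import torch
import torch.nn.functional as F

def vispe_style_loss(embeddings, tau=0.1, lambda_reg=1.0):
    """Hypothetical VISPE-style objective (a sketch, not the official code).

    embeddings: tensor of shape (num_objects, num_views, dim) holding
                L2-normalized features from a shared encoder for a
                mini-batch of objects, each observed from several views.
    """
    n_obj, n_views, dim = embeddings.shape

    # Stochastic prototypes: for every object, pick one of its views at random.
    proto_idx = torch.randint(n_views, (n_obj,))
    prototypes = embeddings[torch.arange(n_obj), proto_idx]       # (n_obj, dim)

    # Similarity of every view embedding to every object prototype.
    sims = embeddings.reshape(-1, dim) @ prototypes.t() / tau     # (n_obj * n_views, n_obj)

    # Embedding loss: each view should be assigned to its own object's prototype.
    targets = torch.arange(n_obj).repeat_interleave(n_views)
    loss_embed = F.cross_entropy(sims, targets)

    # Multiview consistency regularization (assumed form): views of the same
    # object should induce similar distributions over the object prototypes.
    probs = F.softmax(sims, dim=1).reshape(n_obj, n_views, n_obj)
    mean_probs = probs.mean(dim=1, keepdim=True).expand_as(probs)
    loss_consist = F.kl_div(probs.clamp_min(1e-8).log(), mean_probs,
                            reduction="batchmean")

    return loss_embed + lambda_reg * loss_consist
```

Because the prototype is re-sampled from a different view at every step, the encoder cannot latch onto a single canonical pose, which is one plausible reading of how the "object invariant" representation described in the abstract is encouraged.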
