Paper Title
Unsupervised Domain Adaptation in Person re-ID via k-Reciprocal Clustering and Large-Scale Heterogeneous Environment Synthesis
Paper Authors
Paper Abstract
An ongoing major challenge in computer vision is the task of person re-identification, where the goal is to match individuals across different, non-overlapping camera views. While recent success has been achieved via supervised learning using deep neural networks, such methods have seen limited widespread adoption due to the need for large-scale, customized data annotation. As such, there has been a recent focus on unsupervised learning approaches to mitigate the data annotation issue; however, current approaches in the literature have limited performance compared to supervised learning approaches, as well as limited applicability for adoption in new environments. In this paper, we address the aforementioned challenges faced in person re-identification for real-world, practical scenarios by introducing a novel, unsupervised domain adaptation approach for person re-identification. This is accomplished through the introduction of: i) k-reciprocal tracklet Clustering for Unsupervised Domain Adaptation (ktCUDA), for pseudo-label generation on the target domain, and ii) a Synthesized Heterogeneous RE-id Domain (SHRED) composed of large-scale heterogeneous independent source environments, for improving robustness and adaptability to a wide diversity of target environments. Experimental results across four different image and video benchmark datasets show that the proposed ktCUDA and SHRED approach achieves an average improvement of +5.7 mAP in re-identification performance when compared to existing state-of-the-art methods, and demonstrates better adaptability to different types of environments.
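The abstract only names the ktCUDA mechanism without detailing it, so the following is a minimal illustrative sketch of what k-reciprocal clustering for pseudo-label generation typically involves: building a mutual k-nearest-neighbour graph over tracklet features and treating its connected components as pseudo-identities. The function name `k_reciprocal_pseudo_labels`, the plain Euclidean metric, and the connected-components grouping are illustrative assumptions, not the paper's actual procedure.

```python
import numpy as np
from scipy.sparse import csr_matrix
from scipy.sparse.csgraph import connected_components


def k_reciprocal_pseudo_labels(features, k=20):
    """Assign pseudo-labels by clustering the k-reciprocal
    nearest-neighbour graph of tracklet features.

    A sample j is a k-reciprocal neighbour of i when j is among
    the k nearest neighbours of i AND i is among the k nearest
    neighbours of j. Connected components of this mutual graph
    serve as pseudo-identity labels. (Hypothetical sketch; the
    paper's ktCUDA procedure may differ in metric and grouping.)
    """
    # Pairwise Euclidean distances between L2-normalised features.
    feats = features / np.linalg.norm(features, axis=1, keepdims=True)
    dists = np.linalg.norm(feats[:, None] - feats[None, :], axis=2)

    n = len(feats)
    # k nearest neighbours per sample, excluding self (index 0 after sort).
    knn = np.argsort(dists, axis=1)[:, 1:k + 1]

    # Build the mutual (k-reciprocal) adjacency matrix.
    adj = np.zeros((n, n), dtype=bool)
    for i in range(n):
        adj[i, knn[i]] = True
    mutual = adj & adj.T

    # Each connected component becomes one pseudo-identity label.
    _, labels = connected_components(csr_matrix(mutual), directed=False)
    return labels


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # 30 synthetic tracklet descriptors standing in for CNN features.
    feats = rng.normal(size=(30, 128))
    print(k_reciprocal_pseudo_labels(feats, k=5))
```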