Paper Title


PASS: Part-Aware Self-Supervised Pre-Training for Person Re-Identification

Paper Authors

Kuan Zhu, Haiyun Guo, Tianyi Yan, Yousong Zhu, Jinqiao Wang, Ming Tang

Abstract


In person re-identification (ReID), recent studies have validated that pre-training models on unlabelled person images is much better than pre-training on ImageNet. However, these studies directly apply existing self-supervised learning (SSL) methods designed for image classification to ReID without any adaptation of the framework. These SSL methods match the outputs of local views (e.g., a red T-shirt, blue shorts) to those of the global views simultaneously, losing many details. In this paper, we propose a ReID-specific pre-training method, Part-Aware Self-Supervised pre-training (PASS), which can generate part-level features to offer fine-grained information and is more suitable for ReID. PASS divides each image into several local areas, and the local views randomly cropped from each area are assigned a specific learnable [PART] token. On the other hand, the [PART]s of all local areas are also appended to the global views. PASS learns to match the outputs of the local views and global views on the same [PART]. That is, the [PART] learned from the local views of a local area is matched only with the corresponding [PART] learned from the global views. As a result, each [PART] can focus on a specific local area of the image and extract fine-grained information about this area. Experiments show that PASS sets new state-of-the-art performance on Market1501 and MSMT17 across various ReID tasks; e.g., a vanilla ViT-S/16 pre-trained with PASS achieves 92.2%/90.2%/88.5% mAP on Market1501 for supervised/UDA/USL ReID. Our code is available at https://github.com/CASIA-IVA-Lab/PASS-reID.
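To make the per-part matching idea concrete, here is a minimal NumPy toy sketch (not the authors' implementation; the stripe splitting, 8-dim part embeddings, and cosine-based loss are illustrative assumptions): the image is split into horizontal local areas, and each local view's [PART] output is compared only against the same part's output from the global view, rather than against a single global representation.

```python
import numpy as np

rng = np.random.default_rng(0)

def horizontal_stripes(img, num_parts):
    """Split an image array (H, W, C) into num_parts horizontal local areas,
    mimicking how PASS assigns local crops to part-specific [PART] tokens."""
    h = img.shape[0]
    bounds = np.linspace(0, h, num_parts + 1).astype(int)
    return [img[bounds[i]:bounds[i + 1]] for i in range(num_parts)]

def part_matching_loss(local_parts, global_parts):
    """Match each local view's part embedding ONLY to the corresponding
    part embedding from the global view (cosine distance, averaged).
    A generic global-vs-local loss would instead mix all parts together."""
    losses = []
    for loc, glob in zip(local_parts, global_parts):
        cos = np.dot(loc, glob) / (np.linalg.norm(loc) * np.linalg.norm(glob))
        losses.append(1.0 - cos)
    return float(np.mean(losses))

# Toy example: 3 parts on a 192x64 person image, 8-dim part embeddings.
num_parts = 3
img = rng.random((192, 64, 3))
stripes = horizontal_stripes(img, num_parts)

# Stand-ins for the [PART] outputs of the local and global branches.
local_parts = [rng.random(8) for _ in range(num_parts)]
global_parts = [rng.random(8) for _ in range(num_parts)]
loss = part_matching_loss(local_parts, global_parts)
```

Because each part is matched only to its counterpart, the loss encourages every [PART] token to specialize in one local area, which is the property the abstract attributes to PASS.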
