Paper Title
Towards Homogeneous Modality Learning and Multi-Granularity Information Exploration for Visible-Infrared Person Re-Identification
Paper Authors
Paper Abstract
Visible-infrared person re-identification (VI-ReID) is a challenging and essential task, which aims to retrieve a set of person images across visible and infrared camera views. To mitigate the impact of the large modality discrepancy existing in heterogeneous images, previous methods attempt to apply generative adversarial networks (GANs) to generate modality-consistent data. However, due to severe color variations between the visible domain and the infrared domain, the generated fake cross-modality samples often fail to possess sufficient quality to fill the modality gap between the synthesized scenarios and the target real ones, which leads to sub-optimal feature representations. In this work, we address the cross-modality matching problem with Aligned Grayscale Modality (AGM), a unified dark-line spectrum that reformulates visible-infrared dual-mode learning as a gray-gray single-mode learning problem. Specifically, we generate the grayscale modality from the homogeneous visible images. Then, we train a style transfer model to transfer infrared images into homogeneous grayscale images. In this way, the modality discrepancy is significantly reduced in the image space. To reduce the remaining appearance discrepancy, we further introduce a multi-granularity feature extraction network to conduct feature-level alignment. Rather than relying only on global information, we propose to exploit local (head-shoulder) features to assist person Re-ID; the global and local cues complement each other to form a stronger feature descriptor. Comprehensive experiments on the mainstream evaluation datasets, including SYSU-MM01 and RegDB, indicate that our method significantly boosts cross-modality retrieval performance against state-of-the-art methods.
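To make two of the steps in the abstract concrete, below is a minimal PyTorch sketch, not the authors' implementation. rgb_to_grayscale_modality shows one standard way to derive a grayscale modality from visible RGB images (the BT.601 luma weights are an assumption, as the abstract does not specify the conversion), and multi_granularity_descriptor illustrates fusing a global feature with a head-shoulder local feature; both the top-third crop ratio and the concatenation fusion are hypothetical choices for illustration.

import torch
import torch.nn as nn

def rgb_to_grayscale_modality(rgb: torch.Tensor) -> torch.Tensor:
    """Map visible RGB images (B, 3, H, W) in [0, 1] to a grayscale
    modality, replicated to 3 channels so a single backbone can consume
    both modalities. BT.601 luma weights are an assumption, not taken
    from the paper."""
    r, g, b = rgb[:, 0:1], rgb[:, 1:2], rgb[:, 2:3]
    gray = 0.299 * r + 0.587 * g + 0.114 * b
    return gray.repeat(1, 3, 1, 1)

def multi_granularity_descriptor(backbone: nn.Module, img: torch.Tensor) -> torch.Tensor:
    """Concatenate a global (full-image) feature with a local
    head-shoulder feature. The top-1/3 crop for the head-shoulder
    region and the concatenation fusion are illustrative guesses."""
    global_feat = backbone(img)                    # full-body cue
    h = img.shape[2]
    local_feat = backbone(img[:, :, : h // 3, :])  # head-shoulder region
    return torch.cat([global_feat, local_feat], dim=1)

Here backbone stands for any feature extractor with global pooling (e.g., a ResNet trunk ending in pooled embeddings), so it accepts the smaller head-shoulder crop; feeding the grayscale output of rgb_to_grayscale_modality and the style-transferred infrared images through the same descriptor reflects the single-mode learning idea the abstract describes.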