Paper Title
Multimodal Data Augmentation for Visual-Infrared Person ReID with Corrupted Data
Paper Authors
Paper Abstract
The re-identification (ReID) of individuals over a complex network of cameras is a challenging task, especially under real-world surveillance conditions. Several deep learning models have been proposed for visible-infrared (V-I) person ReID to recognize individuals from images captured using RGB and IR cameras. However, performance may decline considerably if the RGB and IR images captured at test time are corrupted (e.g., by noise, blur, or weather conditions). Although various data augmentation (DA) methods have been explored to improve generalization capacity, these are not adapted for V-I person ReID. In this paper, a specialized DA strategy is proposed to address this multimodal setting. Given both the V and I modalities, this strategy helps diminish the impact of corruption on the accuracy of deep person ReID models. Corruption may be modality-specific, and the other modality often provides complementary information. Our multimodal DA strategy is designed specifically to encourage modality collaboration and reinforce generalization capability. For instance, punctual masking of modalities forces the model to select the informative modality. Local DA is also explored for finer selection of features within and among modalities. The impact of training baseline fusion models for V-I person ReID using the proposed multimodal DA strategy is assessed on corrupted versions of the SYSU-MM01, RegDB, and ThermalWORLD datasets in terms of complexity and efficiency. Results indicate that our strategy gives V-I ReID models the ability to exploit both shared and individual modality knowledge, so they can outperform models trained without DA or with unimodal DA. GitHub code: https://github.com/art2611/ML-MDA.
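The "punctual masking of modalities" mentioned in the abstract can be illustrated with a minimal sketch: with some probability, one modality of a paired V-I sample is zeroed out during training, forcing the fusion model to rely on the remaining modality. The function name, masking probability, and use of zero-masking below are illustrative assumptions; the paper's exact ML-MDA scheme is defined in the linked repository and may differ.

```python
import numpy as np

def mask_modality(visible, infrared, p_mask=0.2, rng=None):
    """Hypothetical punctual modality masking for paired V-I batches.

    With probability p_mask, zero out the visible image of a sample;
    with the same probability, zero out its infrared image instead.
    At most one modality per sample is masked, so the fusion model
    always has at least one informative input to rely on.

    visible:  array of shape (batch, C_v, H, W)
    infrared: array of shape (batch, C_i, H, W)
    """
    rng = np.random.default_rng() if rng is None else rng
    v, i = visible.copy(), infrared.copy()
    for k in range(v.shape[0]):
        r = rng.random()
        if r < p_mask:            # mask the visible modality
            v[k] = 0.0
        elif r < 2 * p_mask:      # mask the infrared modality
            i[k] = 0.0
        # otherwise, keep both modalities intact
    return v, i
```

In practice such an augmentation would be applied per training batch, alongside standard unimodal DA (cropping, flipping, erasing) on whichever modality is left unmasked.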