Localization Distillation for Object Detection

Paper title

Localization Distillation for Object Detection

Authors

Zhaohui Zheng, Rongguang Ye, Qibin Hou, Dongwei Ren, Ping Wang, Wangmeng Zuo, Ming-Ming Cheng

Abstract

Previous knowledge distillation (KD) methods for object detection mostly focus on feature imitation rather than on mimicking the prediction logits, because logit mimicking is inefficient at distilling localization information. In this paper, we investigate whether logit mimicking always lags behind feature imitation. Towards this goal, we first present a novel localization distillation (LD) method that can efficiently transfer localization knowledge from the teacher to the student. Second, we introduce the concept of a valuable localization region, which helps to selectively distill the classification and localization knowledge for a certain region. Combining these two new components, we show for the first time that logit mimicking can outperform feature imitation, and that the absence of localization distillation is a critical reason why logit mimicking has underperformed for years. Thorough studies exhibit the great potential of logit mimicking, which can significantly alleviate localization ambiguity, learn robust feature representations, and ease the training difficulty in the early stage. We also provide a theoretical connection between the proposed LD and classification KD: they share an equivalent optimization effect. Our distillation scheme is simple as well as effective and can be easily applied to both dense horizontal object detectors and rotated object detectors. Extensive experiments on the MS COCO, PASCAL VOC, and DOTA benchmarks demonstrate that our method achieves considerable AP improvement without any sacrifice in inference speed. Our source code and pretrained models are publicly available at https://github.com/HikariTJU/LD.
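
The abstract does not spell out the loss itself, so the following is only a minimal sketch of what logit mimicking for localization could look like. It assumes that each of the four box edges is predicted as a vector of logits over discretized candidate positions, and that the student is trained to match the teacher's temperature-softened distribution through KL divergence. The function names, the temperature value, and the input layout are illustrative assumptions, not the authors' released implementation.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def localization_distillation_loss(student_edges, teacher_edges, temperature=10.0):
    """Average KL(teacher || student) over the box edges.

    Assumption: each argument is a list of logit vectors, one per box
    edge, scoring discretized candidate positions for that edge.
    """
    losses = []
    for s_logits, t_logits in zip(student_edges, teacher_edges):
        p_teacher = softmax(t_logits, temperature)
        p_student = softmax(s_logits, temperature)
        losses.append(kl_divergence(p_teacher, p_student))
    return sum(losses) / len(losses)


# Hypothetical logits for the four edges of one box.
teacher = [[0.0, 2.0, 1.0, -1.0]] * 4
student = [[1.0, 0.0, 0.5, 0.2]] * 4
loss = localization_distillation_loss(student, teacher)
```

In this sketch the loss is zero when the student reproduces the teacher's edge distributions and grows as they diverge. A higher temperature flattens both distributions, so the student also sees the teacher's relative preferences among less likely edge positions.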
