Exploring Effective Knowledge Transfer for Few-shot Object Detection

Paper title

Exploring Effective Knowledge Transfer for Few-shot Object Detection

Authors

Zhiyuan Zhao, Qingjie Liu, Yunhong Wang

Abstract

Recently, few-shot object detection (FSOD) has received much attention from the community, and many methods have been proposed to address this problem from a knowledge transfer perspective. Although promising results have been achieved, these methods are not shot-stable: methods that excel in low-shot regimes are likely to struggle in high-shot regimes, and vice versa. We believe this is because the primary challenge of FSOD changes as the number of shots varies. In the low-shot regime, the primary challenge is the lack of inner-class variation. In the high-shot regime, as the variance approaches the real one, the main hindrance to performance is the misalignment between the learned and true distributions. However, these two distinct issues remain unsolved in most existing FSOD methods. In this paper, we propose to overcome these challenges by exploiting the rich knowledge the model has already learned and effectively transferring it to the novel classes. For the low-shot regime, we propose a distribution calibration method to address the lack of inner-class variation. In addition, a shift compensation method is proposed to compensate for possible distribution shift during fine-tuning. For the high-shot regime, we propose to use the knowledge learned from ImageNet as guidance for feature learning in the fine-tuning stage, which implicitly aligns the distributions of the novel classes. Although they target different regimes, these two strategies can work together to further improve FSOD performance. Experiments on both the VOC and COCO benchmarks show that our proposed method significantly outperforms the baseline method and produces competitive results in both low-shot settings (shot < 5) and high-shot settings (shot >= 5). Code is available at https://github.com/JulioZhao97/EffTrans_Fsdet.git.
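
The abstract names distribution calibration as the remedy for missing inner-class variation but does not spell out the procedure. The sketch below illustrates one generic form of the idea and is not the authors' implementation: a novel class with very few samples borrows mean and variance statistics from its nearest base classes, and extra features are then sampled from the calibrated Gaussian. The function names, the diagonal-variance simplification, and the parameters `k` and `alpha` are all assumptions made for illustration.

```python
import math
import random


def mean_vector(rows):
    """Per-dimension mean of a list of equal-length feature vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def calibrate(novel_feats, base_stats, k=2, alpha=0.1):
    """Hypothetical calibration step (not the paper's exact method).

    novel_feats: list of feature vectors for the few novel-class shots.
    base_stats:  dict mapping base-class name -> (mean, variance), both
                 per-dimension lists (diagonal covariance is an assumption).
    k:           number of nearest base classes to borrow from (assumed).
    alpha:       constant added to the variance to widen it (assumed).
    """
    if not base_stats:
        raise ValueError("base_stats must contain at least one base class")
    novel_mean = mean_vector(novel_feats)
    # Rank base classes by Euclidean distance between their mean and the novel mean.
    nearest = sorted(
        base_stats.values(), key=lambda s: math.dist(s[0], novel_mean)
    )[:k]
    m = len(nearest)
    dim = len(novel_mean)
    # Calibrated mean: average of the novel mean and the borrowed base means.
    cal_mean = [
        (novel_mean[d] + sum(mu[d] for mu, _ in nearest)) / (m + 1)
        for d in range(dim)
    ]
    # Calibrated variance: average borrowed variance plus a small spread term.
    cal_var = [
        sum(var[d] for _, var in nearest) / m + alpha for d in range(dim)
    ]
    return cal_mean, cal_var


def sample_features(mean, var, n, rng):
    """Draw n synthetic feature vectors from the calibrated diagonal Gaussian."""
    return [
        [rng.gauss(mu, math.sqrt(v)) for mu, v in zip(mean, var)]
        for _ in range(n)
    ]
```

In a setup like this, the sampled features would be mixed with the real few-shot features when fine-tuning the classifier, so that the novel class is represented with a plausible spread rather than a handful of points. How the paper actually selects base classes and compensates for feature shift during fine-tuning is described in the paper and its code repository, not here.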
