Paper Title
What Makes for End-to-End Object Detection?
Paper Authors
Paper Abstract
Object detection has recently achieved a breakthrough by removing the last non-differentiable component in the pipeline, Non-Maximum Suppression (NMS), and building an end-to-end system. However, what makes one-to-one prediction possible has not been well understood. In this paper, we first point out that one-to-one positive sample assignment is the key factor, whereas one-to-many assignment in previous detectors causes redundant predictions at inference. Second, we surprisingly find that even when trained with one-to-one assignment, previous detectors still produce redundant predictions. We identify the classification cost in the matching cost as the main ingredient: (1) previous detectors consider only the location cost; (2) once the classification cost is additionally introduced, previous detectors immediately produce one-to-one predictions during inference. We introduce the concept of score gap to explore the effect of the matching cost. Classification cost enlarges the score gap by choosing as positive samples those with the highest score in the training iteration and by reducing the noisy positive samples brought by location cost alone. Finally, we demonstrate the advantages of end-to-end object detection on crowded scenes. Code is available at: https://github.com/PeizeSun/OneNet
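To make the role of the classification cost concrete, the following is a minimal sketch of one-to-one positive sample assignment, not the authors' implementation. It rests on several assumptions the abstract does not state: the classification cost is taken as `1 - score` for the ground-truth class, the location cost is the L1 distance between normalized boxes, the weights `w_cls` and `w_loc` are hypothetical, and each object simply takes the prediction with the lowest total cost.

```python
from typing import List, Sequence

Box = Sequence[float]  # (x1, y1, x2, y2), assumed normalized to [0, 1]


def l1_distance(a: Box, b: Box) -> float:
    """Location cost (assumption): L1 distance between two boxes."""
    return sum(abs(p - q) for p, q in zip(a, b))


def matching_cost(score: float, pred_box: Box, gt_box: Box,
                  w_cls: float = 1.0, w_loc: float = 1.0) -> float:
    """Total cost = weighted classification cost + weighted location cost.

    The classification cost (assumption) is 1 - score, so a confident
    prediction for the ground-truth class is cheaper to match.
    """
    return w_cls * (1.0 - score) + w_loc * l1_distance(pred_box, gt_box)


def assign_one_to_one(class_scores: List[Sequence[float]],
                      pred_boxes: List[Box],
                      gt_boxes: List[Box],
                      gt_labels: List[int],
                      w_cls: float = 1.0,
                      w_loc: float = 1.0) -> List[int]:
    """For each ground-truth object, return the index of its single positive sample."""
    assigned = []
    for gt_box, label in zip(gt_boxes, gt_labels):
        costs = [
            matching_cost(scores[label], box, gt_box, w_cls, w_loc)
            for scores, box in zip(class_scores, pred_boxes)
        ]
        # Exactly one positive sample per object: the minimum-cost prediction.
        assigned.append(min(range(len(costs)), key=costs.__getitem__))
    return assigned


# Toy data: two predictions sit on the same object (class 0).
# Prediction 0 is slightly better localized but has a low score;
# prediction 1 is slightly worse localized but has a high score.
class_scores = [[0.2], [0.9]]
pred_boxes = [(0.1, 0.1, 0.5, 0.52), (0.1, 0.1, 0.5, 0.55)]
gt_boxes = [(0.1, 0.1, 0.5, 0.5)]
gt_labels = [0]

location_only = assign_one_to_one(class_scores, pred_boxes, gt_boxes, gt_labels, w_cls=0.0)
with_classification = assign_one_to_one(class_scores, pred_boxes, gt_boxes, gt_labels)
```

With location cost alone, the low-scoring prediction 0 becomes the positive sample, which is the kind of noisy positive the abstract describes. Adding the classification cost moves the positive label to the highest-scoring prediction 1. Note that this per-object minimum does not prevent two objects from selecting the same prediction; a full bipartite matching would be needed to guarantee that.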