Detective: An Attentive Recurrent Model for Sparse Object Detection

Paper Title

Detective: An Attentive Recurrent Model for Sparse Object Detection

Authors

Amine Kechaou, Manuel Martinez, Monica Haurilet, Rainer Stiefelhagen

Abstract

In this work, we present Detective, an attentive object detector that identifies objects in images in a sequential manner. Our network is based on an encoder-decoder architecture, where the encoder is a convolutional neural network and the decoder is a convolutional recurrent neural network coupled with an attention mechanism. At each iteration, our decoder focuses on the relevant parts of the image using an attention mechanism, and then estimates the object's class and bounding box coordinates. Current object detection models generate dense predictions and rely on post-processing to remove duplicate predictions. Detective is a sparse object detector that generates a single bounding box per object instance. However, training a sparse object detector is challenging, as it requires the model to reason at the instance level and not just at the class and spatial levels. We propose a training mechanism based on the Hungarian algorithm and a loss that balances the localization and classification tasks. This allows Detective to achieve promising results on the PASCAL VOC object detection dataset. Our experiments demonstrate that sparse object detection is possible and has great potential for future developments in applications where the order of the objects to be predicted is of interest.
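
The abstract does not spell out the matching cost or the loss, so the following is only an illustrative sketch of the general idea behind Hungarian-style training: each ground-truth object is assigned to exactly one prediction so that the total cost is minimal. The cost used here (one minus the predicted probability of the true class, plus a weighted one minus IoU) and the names `iou`, `match_cost`, `best_assignment`, and `box_weight` are assumptions made for illustration, not the authors' formulation. For brevity the sketch enumerates all assignments, which yields the same optimum as the Hungarian algorithm on small inputs but does not scale.

```python
from itertools import permutations


def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def match_cost(pred, truth, box_weight=1.0):
    """Hypothetical cost: classification term plus weighted localization term."""
    class_probs, pred_box = pred      # prediction: (class probabilities, box)
    true_class, true_box = truth      # ground truth: (class index, box)
    return (1.0 - class_probs[true_class]) + box_weight * (1.0 - iou(pred_box, true_box))


def best_assignment(preds, truths):
    """Return (total_cost, [(pred_index, truth_index), ...]) with minimal cost.

    Brute-force stand-in for the Hungarian algorithm; assumes
    len(preds) >= len(truths) and only a handful of objects.
    """
    best = None
    for perm in permutations(range(len(preds)), len(truths)):
        cost = sum(match_cost(preds[p], truths[t]) for t, p in enumerate(perm))
        if best is None or cost < best[0]:
            best = (cost, [(p, t) for t, p in enumerate(perm)])
    return best


# Two sequential predictions that come out in the "wrong" order relative
# to the ground-truth list; the matching pairs them up regardless of order.
preds = [
    ([0.1, 0.9], (0, 0, 10, 10)),
    ([0.8, 0.2], (20, 20, 30, 30)),
]
truths = [
    (0, (20, 20, 30, 30)),
    (1, (0, 0, 10, 10)),
]
total_cost, pairs = best_assignment(preds, truths)
print(pairs)
```

Because the assignment is recomputed from the costs, the loss does not penalize the model for emitting objects in a different order than the annotation lists them. In practice a polynomial-time solver would replace the enumeration.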
