Paper title
On Utilizing Relationships for Transferable Few-Shot Fine-Grained Object Detection
Paper authors
Paper abstract
State-of-the-art object detectors are fast and accurate, but they require a large amount of well-annotated training data to perform well. However, obtaining a large number of training annotations specific to a particular task, i.e., fine-grained annotations, is costly in practice. In contrast, obtaining common-sense relationships from text, e.g., "a table-lamp is a lamp that sits on top of a table", is much easier. Additionally, common-sense relationships like "on-top-of" are easy to annotate in a task-agnostic fashion. In this paper, we propose a probabilistic model that uses such relational knowledge to transform an off-the-shelf detector of coarse object categories (e.g., "table", "lamp") into a detector of fine-grained categories (e.g., "table-lamp"). We demonstrate that our method, RelDetect, achieves performance competitive with finetuning-based state-of-the-art object detector baselines when an extremely small amount of fine-grained annotations is available ($0.2\%$ of the entire dataset). We also demonstrate that RelDetect is able to utilize the inherent transferability of relationship information to obtain better performance ($+5$ mAP points) than the above baselines on an unseen dataset (zero-shot transfer). In summary, we demonstrate the power of using relationships for object detection on datasets where fine-grained object categories can be linked to coarse-grained categories via suitable relationships.