Paper Title
SAP-DETR: Bridging the Gap Between Salient Points and Queries-Based Transformer Detector for Fast Model Convergency
Paper Authors
Paper Abstract
Recently, the dominant DETR-based approaches apply a central-concept spatial prior to accelerate Transformer detector convergence. These methods gradually refine the reference points toward the centers of target objects and imbue object queries with the updated central reference information for spatially conditional attention. However, centralizing reference points may severely deteriorate the queries' saliency and confuse detectors, because the spatial prior becomes indiscriminative. To bridge the gap between the reference points of salient queries and Transformer detectors, we propose SAlient Point-based DETR (SAP-DETR), which treats object detection as a transformation from salient points to instance objects. In SAP-DETR, we explicitly initialize a query-specific reference point for each object query, gradually aggregate them into an instance object, and then predict the distance from each side of the bounding box to these points. By rapidly attending to the query-specific reference region and other conditional extreme regions of the image features, SAP-DETR can effectively bridge the gap between salient points and the query-based Transformer detector while converging significantly faster. Our extensive experiments demonstrate that SAP-DETR achieves 1.4 times the convergence speed with competitive performance. Under the standard training scheme, SAP-DETR consistently improves on the SOTA approaches by 1.0 AP. Based on ResNet-DC-101, SAP-DETR achieves 46.9 AP.
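The box parameterization mentioned in the abstract (a reference point plus the distance from each side of the bounding box to that point) can be illustrated with a minimal sketch. It assumes normalized image coordinates and the side order (left, top, right, bottom). Both function names are hypothetical and are not taken from the SAP-DETR code; the abstract does not specify these details.

```python
# Illustrative sketch only: the side order (left, top, right, bottom) and the
# use of normalized [0, 1] coordinates are assumptions, not the paper's spec.

def box_from_point_and_distances(point, distances):
    """Return (x_min, y_min, x_max, y_max) from a point and four side distances."""
    x, y = point
    left, top, right, bottom = distances
    return (x - left, y - top, x + right, y + bottom)


def distances_from_box_and_point(box, point):
    """Inverse mapping: side distances (left, top, right, bottom) for a point.

    All four distances are non-negative only if the point lies inside the box.
    """
    x_min, y_min, x_max, y_max = box
    x, y = point
    return (x - x_min, y - y_min, x_max - x, y_max - y)


if __name__ == "__main__":
    ref_point = (0.5, 0.5)
    side_dists = (0.25, 0.125, 0.25, 0.375)
    box = box_from_point_and_distances(ref_point, side_dists)
    print(box)
    print(distances_from_box_and_point(box, ref_point))
```

Under this parameterization the reference point does not have to sit at the object center: any point inside the box yields a valid set of four non-negative distances, which matches the abstract's contrast between salient points and centralized reference points.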