Paper Title

End-to-end Tracking with a Multi-query Transformer

Authors

Bruno Korbar, Andrew Zisserman

Abstract

Multiple-object tracking (MOT) is a challenging task that requires simultaneous reasoning about the location, appearance, and identity of the objects in a scene over time. Our aim in this paper is to move beyond tracking-by-detection approaches, which perform well on datasets where the object classes are known, to class-agnostic tracking that also performs well for unknown object classes. To this end, we make the following three contributions. First, we introduce semantic detector queries, which enable an object to be localized by specifying its approximate position, its appearance, or both. Second, we use these queries within an auto-regressive framework for tracking, and propose a multi-query tracking transformer (MQT) model for simultaneous tracking and appearance-based re-identification (reID), based on the transformer architecture with deformable attention. This formulation allows the tracker to operate in a class-agnostic manner, and the model can be trained end-to-end. Finally, we demonstrate that MQT performs competitively on standard MOT benchmarks, outperforms all baselines on generalised-MOT, and generalises well to much harder tracking problems such as tracking any object on the TAO dataset.
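To make the two central ideas concrete, the following is a minimal toy sketch, not the MQT model itself. It assumes a query may carry an approximate position, an appearance embedding, or both, and that each frame's matches are fed back as the next frame's queries (the auto-regressive part). The transformer with deformable attention is replaced here by a plain lowest-cost candidate lookup, and every name (`Query`, `Candidate`, `localize`, `track`) is hypothetical, introduced only for illustration.

```python
from dataclasses import dataclass
from math import dist, sqrt
from typing import Dict, List, Optional, Sequence, Tuple


@dataclass
class Query:
    """Hypothetical query: position, appearance, or both may be given."""
    track_id: int
    position: Optional[Tuple[float, float]] = None    # approximate (x, y)
    appearance: Optional[Tuple[float, ...]] = None    # appearance embedding


@dataclass
class Candidate:
    """Hypothetical per-frame object candidate."""
    position: Tuple[float, float]
    appearance: Tuple[float, ...]


def cosine_distance(a: Sequence[float], b: Sequence[float]) -> float:
    # Assumes both embeddings are non-zero vectors of equal length.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def localize(query: Query, candidates: Sequence[Candidate]) -> Candidate:
    """Toy stand-in for the detector: use whichever cues the query provides."""
    if query.position is None and query.appearance is None:
        raise ValueError("a query needs a position, an appearance, or both")

    def cost(c: Candidate) -> float:
        total = 0.0
        if query.position is not None:
            total += dist(query.position, c.position)
        if query.appearance is not None:
            total += cosine_distance(query.appearance, c.appearance)
        return total

    return min(candidates, key=cost)


def track(
    initial_queries: Sequence[Query],
    frames: Sequence[Sequence[Candidate]],
) -> Dict[int, List[Tuple[float, float]]]:
    """Auto-regressive loop: each frame's matches become the next queries."""
    queries = list(initial_queries)
    trajectories: Dict[int, List[Tuple[float, float]]] = {
        q.track_id: [] for q in queries
    }
    for candidates in frames:
        next_queries = []
        for q in queries:
            match = localize(q, candidates)
            trajectories[q.track_id].append(match.position)
            # Feed the matched position and appearance back in as the new query.
            next_queries.append(Query(q.track_id, match.position, match.appearance))
        queries = next_queries
    return trajectories
```

In this sketch one track can be started from a position-only query and another from an appearance-only query. After the first frame both carry position and appearance, which is the sense in which tracking and appearance-based re-identification are handled together.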
