Unifying Short and Long-Term Tracking with Graph Hierarchies

Paper Title

Unifying Short and Long-Term Tracking with Graph Hierarchies

Authors

Orcun Cetintas, Guillem Brasó, Laura Leal-Taixé

Abstract

Tracking objects over long videos effectively means solving a spectrum of problems, from short-term association for un-occluded objects to long-term association for objects that are occluded and then reappear in the scene. Methods tackling these two tasks are often disjoint and crafted for specific scenarios, and top-performing approaches are often a mix of techniques, which yields engineering-heavy solutions that lack generality. In this work, we question the need for hybrid approaches and introduce SUSHI, a unified and scalable multi-object tracker. Our approach processes long clips by splitting them into a hierarchy of subclips, which enables high scalability. We leverage graph neural networks to process all levels of the hierarchy, which makes our model unified across temporal scales and highly general. As a result, we obtain significant improvements over state-of-the-art on four diverse datasets. Our code and models are available at bit.ly/sushi-mot.
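The abstract's idea of splitting a long clip into a hierarchy of subclips can be illustrated with a minimal sketch. This is not the SUSHI implementation: the function name `build_hierarchy` and the parameters `base_len` and `merge` are hypothetical, and the sketch only computes the nested frame ranges. It performs no association and uses no graph neural network, which in the paper is what processes each level.

```python
from typing import List, Tuple

# A clip is a half-open frame range [start, end).
Clip = Tuple[int, int]


def build_hierarchy(num_frames: int, base_len: int, merge: int) -> List[List[Clip]]:
    """Return the levels of a clip hierarchy (hypothetical helper).

    Level 0 splits the video into clips of at most `base_len` frames.
    Each higher level joins up to `merge` consecutive clips of the level
    below, until a single clip covers the whole video.
    """
    if num_frames <= 0 or base_len <= 0 or merge < 2:
        raise ValueError("need num_frames > 0, base_len > 0 and merge >= 2")

    # Lowest level: short, non-overlapping subclips.
    level = [
        (start, min(start + base_len, num_frames))
        for start in range(0, num_frames, base_len)
    ]
    levels = [level]

    # Higher levels: each clip spans a group of consecutive child clips.
    while len(level) > 1:
        level = [
            (level[i][0], level[min(i + merge, len(level)) - 1][1])
            for i in range(0, len(level), merge)
        ]
        levels.append(level)
    return levels


if __name__ == "__main__":
    for depth, clips in enumerate(build_hierarchy(10, 2, 2)):
        print(depth, clips)
```

In a tracker built on such a hierarchy, the lowest level would handle short-term association within each subclip, and higher levels would link the resulting tracklets over progressively longer time spans. How SUSHI does this is described in the paper itself, not in this sketch.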
