Online Descriptor Enhancement via Self-Labelling Triplets for Visual Data Association

Paper Title

Online Descriptor Enhancement via Self-Labelling Triplets for Visual Data Association

Authors

Yorai Shaoul, Katherine Liu, Kyel Ok, Nicholas Roy

Abstract

Object-level data association is central to robotic applications such as tracking-by-detection and object-level simultaneous localization and mapping. While current learned visual data association methods outperform hand-crafted algorithms, many rely on large collections of domain-specific training examples that can be difficult to obtain without prior knowledge. Additionally, such methods often remain fixed at inference time and do not harness observed information to improve their performance. We propose a self-supervised method for incrementally refining visual descriptors to improve performance in the task of object-level visual data association. Our method optimizes deep descriptor generators online by continuously training a widely available image classification network pre-trained with domain-independent data. We show that earlier layers in the network outperform later-stage layers for the data association task while also allowing for a 94% reduction in the number of parameters, enabling online optimization. We show that self-labelling challenging triplets (choosing positive examples separated by large temporal distances and negative examples close in the descriptor space) improves the quality of the learned descriptors for the multi-object tracking task. Finally, we demonstrate that our approach surpasses other visual data association methods applied to a tracking-by-detection task, and show that it provides better performance gains when compared to other methods that attempt to adapt to observed information.
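The triplet selection rule described in the abstract can be illustrated with a minimal sketch. The abstract does not give implementation details, so everything below is an assumption made for illustration: the `Observation` record, the `mine_triplet` and `triplet_loss` helper names, the use of Euclidean distance, and the standard margin-based triplet loss are not taken from the paper. The sketch only shows the idea of picking, for a given anchor, the same-track positive that is farthest away in time and the other-track negative that is closest in descriptor space.

```python
import math
from typing import List, NamedTuple, Optional, Tuple


class Observation(NamedTuple):
    # Hypothetical record for one detection; field names are assumptions.
    track_id: int                  # self-assigned label from the tracker
    frame: int                     # frame index of the detection
    descriptor: Tuple[float, ...]  # visual descriptor of the detection


def euclidean(a: Tuple[float, ...], b: Tuple[float, ...]) -> float:
    """Euclidean distance between two descriptors (assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def mine_triplet(
    anchor: Observation, observations: List[Observation]
) -> Optional[Tuple[Observation, Observation, Observation]]:
    """Pick a challenging (anchor, positive, negative) triplet.

    Positive: same track as the anchor, largest temporal distance.
    Negative: different track, smallest descriptor distance.
    Returns None when no positive or no negative is available.
    """
    positives = [
        o for o in observations
        if o.track_id == anchor.track_id and o is not anchor
    ]
    negatives = [o for o in observations if o.track_id != anchor.track_id]
    if not positives or not negatives:
        return None
    positive = max(positives, key=lambda o: abs(o.frame - anchor.frame))
    negative = min(
        negatives, key=lambda o: euclidean(o.descriptor, anchor.descriptor)
    )
    return anchor, positive, negative


def triplet_loss(
    anchor: Observation,
    positive: Observation,
    negative: Observation,
    margin: float = 1.0,
) -> float:
    """Standard margin-based triplet loss (an assumed choice of loss)."""
    d_pos = euclidean(anchor.descriptor, positive.descriptor)
    d_neg = euclidean(anchor.descriptor, negative.descriptor)
    return max(0.0, d_pos - d_neg + margin)


# Toy data: three observations of track 0, one each of tracks 1 and 2.
observations = [
    Observation(0, 0, (0.0, 0.0)),
    Observation(0, 1, (0.1, 0.0)),
    Observation(0, 10, (0.5, 0.0)),
    Observation(1, 5, (1.0, 0.0)),
    Observation(2, 5, (5.0, 0.0)),
]
triplet = mine_triplet(observations[0], observations)
```

In this toy data the anchor is the first observation of track 0, so the selected positive is the track-0 observation at frame 10 and the selected negative is the track-1 observation, whose descriptor is nearest to the anchor's. In an online setting, a loss of this kind would be backpropagated through the descriptor network; that step is omitted here.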
