Paper title
Robust and efficient post-processing for video object detection
Paper authors
Paper abstract
Object recognition in video is an important task for many applications, including autonomous driving perception, surveillance, wearable devices, and IoT networks. Object recognition in video data is more challenging than in still images because of blur, occlusions, and rare object poses. The current state of the art is achieved either by video-specific detectors with high computational cost or by standard image detectors combined with a fast post-processing algorithm. This work introduces a novel post-processing pipeline that overcomes some of the limitations of previous post-processing methods by introducing a learning-based similarity evaluation between detections across frames. Our method improves the results of state-of-the-art video-specific detectors, especially for fast-moving objects, and has low resource requirements. When applied to efficient still-image detectors such as YOLO, it provides results comparable to those of much more computationally intensive detectors.
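To illustrate the general idea of linking detections across frames by a similarity score, here is a minimal Python sketch. It is not the method of the paper: the abstract describes a learning-based similarity, whereas this sketch uses plain box overlap (IoU) as a stand-in. The names `iou` and `link_detections`, the greedy matching strategy, and the `threshold` value are all assumptions made for illustration; a learned similarity function could in principle be passed through the `similarity` parameter.

```python
from typing import Callable, List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2), hypothetical box format


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def link_detections(
    prev: List[Box],
    curr: List[Box],
    similarity: Callable[[Box, Box], float] = iou,  # stand-in for a learned score
    threshold: float = 0.5,  # assumed cut-off, not taken from the paper
) -> List[Tuple[int, int]]:
    """Greedily pair detections of two consecutive frames.

    Returns (prev_index, curr_index) pairs, most similar first; each
    detection is used at most once.
    """
    scored = sorted(
        (
            (similarity(p, c), i, j)
            for i, p in enumerate(prev)
            for j, c in enumerate(curr)
        ),
        reverse=True,
    )
    used_prev, used_curr = set(), set()
    links: List[Tuple[int, int]] = []
    for score, i, j in scored:
        if score < threshold:
            break  # all remaining pairs are less similar
        if i in used_prev or j in used_curr:
            continue
        used_prev.add(i)
        used_curr.add(j)
        links.append((i, j))
    return links


if __name__ == "__main__":
    frame_a = [(0, 0, 10, 10), (20, 20, 30, 30)]
    frame_b = [(21, 21, 31, 31), (1, 1, 11, 11)]
    print(link_detections(frame_a, frame_b))
```

Pure overlap tends to fail for fast-moving objects, whose boxes may barely intersect between consecutive frames. That is one plausible motivation for replacing it with a similarity that also considers appearance, as the abstract suggests.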