Paper Title
Tracking Passengers and Baggage Items Using Multiple Overhead Cameras at Security Checkpoints
Paper Authors
Paper Abstract
We introduce a novel framework to track multiple objects in overhead camera videos for airport checkpoint security scenarios where targets correspond to passengers and their baggage items. We propose a self-supervised learning (SSL) technique to provide the model information about instance segmentation uncertainty from overhead images. Our SSL approach improves object detection by employing a test-time data augmentation and a regression-based, rotation-invariant pseudo-label refinement technique. Our pseudo-label generation method provides multiple geometrically transformed images as inputs to a convolutional neural network (CNN), regresses the augmented detections generated by the network to reduce localization errors, and then clusters them using the mean-shift algorithm. The self-supervised detector model is used in a single-camera tracking algorithm to generate temporal identifiers for the targets. Our method also incorporates a multiview trajectory association mechanism to maintain consistent temporal identifiers as passengers travel across camera views. An evaluation of detection, tracking, and association performances on videos obtained from multiple overhead cameras in a realistic airport checkpoint environment demonstrates the effectiveness of the proposed approach. Our results show that self-supervision improves object detection accuracy by up to 42% without increasing the inference time of the model. Our multicamera association method achieves up to 89% multiobject tracking accuracy with an average computation time of less than 15 ms.
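The abstract's pseudo-label refinement step (pooling detections from multiple geometrically transformed inputs, then grouping them with mean-shift) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the flat-kernel mean-shift, the box-center clustering, and the bandwidth values are all assumptions made for illustration.

```python
import numpy as np

def mean_shift(points, bandwidth=10.0, n_iters=50, tol=1e-3):
    """Flat-kernel mean-shift: move each point toward the mean of the
    original points within `bandwidth` until convergence."""
    points = np.asarray(points, dtype=float)
    shifted = points.copy()
    for _ in range(n_iters):
        moved = 0.0
        for i, p in enumerate(shifted):
            neighbors = points[np.linalg.norm(points - p, axis=1) < bandwidth]
            new_p = neighbors.mean(axis=0)
            moved = max(moved, float(np.linalg.norm(new_p - p)))
            shifted[i] = new_p
        if moved < tol:
            break
    return shifted

def refine_pseudo_labels(boxes, bandwidth=10.0):
    """Cluster augmented detections (x1, y1, x2, y2), assumed already
    mapped back to the original image frame, by mean-shift on their
    centers; average each cluster into one refined pseudo-label box."""
    boxes = np.asarray(boxes, dtype=float)
    centers = (boxes[:, :2] + boxes[:, 2:]) / 2.0
    modes = mean_shift(centers, bandwidth)
    # Assign detections whose modes converged together to one cluster.
    labels = -np.ones(len(boxes), dtype=int)
    cluster_modes = []
    for i, m in enumerate(modes):
        for k, cm in enumerate(cluster_modes):
            if np.linalg.norm(m - cm) < bandwidth / 2:
                labels[i] = k
                break
        else:
            labels[i] = len(cluster_modes)
            cluster_modes.append(m)
    return np.array([boxes[labels == k].mean(axis=0)
                     for k in range(len(cluster_modes))])

# Three noisy detections of one target plus one of another target
# collapse into two refined pseudo-label boxes.
boxes = [[10, 10, 30, 30], [12, 11, 32, 31], [11, 9, 29, 29],
         [100, 100, 120, 130]]
refined = refine_pseudo_labels(boxes, bandwidth=20)
print(refined.shape)  # (2, 4)
```

In the paper's pipeline the grouped boxes would come from rotated and flipped copies of an overhead frame, which is why a rotation-invariant representation matters before clustering; here the boxes are given directly to keep the sketch self-contained.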