DR.VIC: Decomposition and Reasoning for Video Individual Counting

Paper title

DR.VIC: Decomposition and Reasoning for Video Individual Counting

Authors

Han, Tao; Bai, Lei; Gao, Junyu; Wang, Qi; Ouyang, Wanli

Abstract

Pedestrian counting is a fundamental tool for understanding pedestrian patterns and analyzing crowd flow. Existing works (e.g., image-level pedestrian counting, cross-line crowd counting, etc.) either focus only on image-level counting or depend on manually annotated lines. In this work, we propose to approach pedestrian counting from a new perspective, Video Individual Counting (VIC), which counts the total number of individual pedestrians in a given video (each person is counted only once). Instead of relying on Multiple Object Tracking (MOT) techniques, we propose to solve the problem by decomposing all pedestrians into the initial pedestrians who are present in the first frame and the new pedestrians with separate identities in each following frame. An end-to-end Decomposition and Reasoning Network (DRNet) is then designed to predict the initial pedestrian count with a density estimation method and to reason about the count of new pedestrians in each frame with differentiable optimal transport. Extensive experiments are conducted on two datasets with congested pedestrians and diverse scenes, demonstrating that our method outperforms the baselines by a clear margin in counting individual pedestrians. Code: https://github.com/taohan10200/DRNet.
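
The decomposition described in the abstract can be illustrated with a small arithmetic sketch. This is not the DRNet implementation: the function names, the toy density map, and the per-frame inflow values below are all hypothetical, and the sketch assumes that the initial count is obtained by summing a density map and that the count of new pedestrians for each later frame is already available.

```python
from typing import Sequence


def count_from_density(density_map: Sequence[Sequence[float]]) -> float:
    """Estimate a head count by summing a density map (hypothetical helper)."""
    return sum(sum(row) for row in density_map)


def video_individual_count(initial_count: float,
                           inflow_counts: Sequence[float]) -> float:
    """Total individuals = people in the first frame + newcomers in later frames."""
    return initial_count + sum(inflow_counts)


# Toy values, chosen only for illustration.
first_frame_density = [[0.5, 0.5],
                       [1.0, 0.0]]   # sums to 2 people in the first frame
inflows = [1.0, 0.0, 2.0]            # assumed new pedestrians in three later frames

initial = count_from_density(first_frame_density)
total = video_individual_count(initial, inflows)
print(total)
```

Because each person contributes either to the initial count or to exactly one inflow term, nobody is counted twice, and no explicit identity tracking across the whole video is needed.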
