Paper Title
PNL: Efficient Long-Range Dependencies Extraction with Pyramid Non-Local Module for Action Recognition
Authors
Abstract
Capturing long-range spatiotemporal dependencies plays an essential role in improving video features for action recognition. The non-local block, inspired by non-local means, is designed to address this challenge and has shown excellent performance. However, the non-local block adds a significant computation cost to the original network. It also lacks the ability to model regional correlation in videos. To address these limitations, we propose the Pyramid Non-Local (PNL) module, which extends the non-local block by incorporating regional correlation at multiple scales through a pyramid-structured module. This extension improves the effectiveness of the non-local operation by attending to the interaction between different regions. Empirical results demonstrate the effectiveness and efficiency of our PNL module, which achieves state-of-the-art performance of 83.09% on the Mini-Kinetics dataset at a lower computation cost than the non-local block.
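The contrast drawn in the abstract can be illustrated with a minimal NumPy sketch. It is not the paper's architecture: the abstract does not specify the PNL design, so the region pooling (average pooling over spatial windows), the scale set `(2, 4)`, and the function names `non_local`, `avg_pool_regions`, and `pyramid_non_local` are all assumptions made for illustration. The learned embeddings of a real non-local block are also omitted. The sketch only shows why attending to pooled regions at several scales yields a smaller attention matrix than attending to every position.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along one axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def non_local(x):
    """Plain non-local operation on a (T, H, W, C) feature map.

    Every position attends to every other position, so the
    attention matrix has N x N entries with N = T * H * W.
    Learned embeddings are omitted (assumption for brevity).
    """
    C = x.shape[-1]
    flat = x.reshape(-1, C)                 # (N, C)
    attn = softmax(flat @ flat.T)           # (N, N)
    return x + (attn @ flat).reshape(x.shape)

def avg_pool_regions(x, s):
    """Average-pool each frame into s x s spatial windows.

    Hypothetical region descriptor: (T, H, W, C) -> (M, C)
    with M = T * (H // s) * (W // s).
    """
    T, H, W, C = x.shape
    x = x[:, :H - H % s, :W - W % s]        # drop any ragged border
    pooled = x.reshape(T, H // s, s, W // s, s, C).mean(axis=(2, 4))
    return pooled.reshape(-1, C)

def pyramid_non_local(x, scales=(2, 4)):
    """Positions attend to pooled regions at several scales.

    The attention matrix is N x M, where M is the total number
    of regions across all scales (assumed design, M << N).
    """
    C = x.shape[-1]
    flat = x.reshape(-1, C)                                   # (N, C)
    regions = np.concatenate(
        [avg_pool_regions(x, s) for s in scales], axis=0)     # (M, C)
    attn = softmax(flat @ regions.T)                          # (N, M)
    return x + (attn @ regions).reshape(x.shape)
```

For a feature map of shape `(2, 8, 8, 4)`, the plain operation builds a 128 x 128 attention matrix, while this sketch builds a 128 x 40 one (32 regions at scale 2 plus 8 at scale 4). The output shape matches the input in both cases.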