Prior-enhanced Temporal Action Localization using Subject-aware Spatial Attention

Paper Title

Prior-enhanced Temporal Action Localization using Subject-aware Spatial Attention

Authors

Yifan Liu, Youbao Tang, Ning Zhang, Ruei-Sung Lin, Haoqian Wang

Abstract

Temporal action localization (TAL) aims to detect the boundaries and identify the class of each action instance in a long untrimmed video. Current approaches treat video frames homogeneously and tend to give the background and key objects excessive attention, which limits their sensitivity in localizing action boundaries. To this end, we propose a prior-enhanced temporal action localization method (PETAL), which takes only RGB input and incorporates action subjects as priors. The method leverages information about action subjects through a plug-and-play subject-aware spatial attention module (SA-SAM) to generate an aggregated, subject-prioritized representation. Experimental results on the THUMOS-14 and ActivityNet-1.3 datasets demonstrate that PETAL achieves competitive performance using only RGB features: on THUMOS-14, for example, it improves mAP by 2.41% over the state-of-the-art approach that uses RGB features and by 0.25% over the one that additionally uses optical flow features.
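The abstract does not specify how SA-SAM is built, so the following is only a minimal sketch of the general idea of subject-prioritized spatial aggregation, not the authors' module. It assumes per-frame feature maps and a per-frame subject mask are already available; the function name `subject_weighted_pool` and the `background_weight` parameter are hypothetical, introduced for illustration.

```python
import numpy as np


def subject_weighted_pool(features, subject_mask, background_weight=0.1):
    """Pool a (T, H, W, C) feature map into (T, C) frame-level features.

    subject_mask has shape (T, H, W) with values in [0, 1], where 1 marks
    locations covered by the action subject. background_weight is a
    hypothetical positive floor (not from the paper) so that frames with
    no detected subject still produce a valid average.
    """
    features = np.asarray(features, dtype=np.float64)
    mask = np.asarray(subject_mask, dtype=np.float64)

    # Subject locations get weight 1, background locations get the floor.
    weights = background_weight + (1.0 - background_weight) * mask

    # Normalize so the weights of each frame sum to 1 over H and W.
    weights = weights / weights.sum(axis=(1, 2), keepdims=True)

    # Weighted sum over the spatial dimensions; broadcast over channels.
    return (features * weights[..., None]).sum(axis=(1, 2))
```

With an all-zero mask this reduces to plain spatial average pooling, and as the mask concentrates on the subject, the pooled feature is dominated by subject locations rather than background. A learned attention module would replace the fixed weighting used here.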
