Paper Title


Skeleton Sequence and RGB Frame Based Multi-Modality Feature Fusion Network for Action Recognition

Paper Authors

Xiaoguang Zhu, Ye Zhu, Haoyu Wang, Honglin Wen, Yan Yan, Peilin Liu

Paper Abstract

Action recognition has been a popular topic in computer vision owing to its wide application in vision systems. Previous approaches achieve improvement by fusing the modalities of the skeleton sequence and the RGB video. However, such methods face a dilemma between accuracy and efficiency due to the high complexity of the RGB video network. To solve this problem, we propose a multi-modality feature fusion network that combines the modalities of the skeleton sequence and a single RGB frame instead of the RGB video, since the key information contained in the combination of the skeleton sequence and the RGB frame is close to that of the skeleton sequence and the RGB video. In this way, the complementary information is retained while the complexity is reduced by a large margin. To better explore the correspondence between the two modalities, a two-stage fusion framework is introduced in the network. In the early fusion stage, we introduce a skeleton attention module that projects the skeleton sequence onto the single RGB frame to help the RGB frame focus on the limb movement regions. In the late fusion stage, we propose a cross-attention module that fuses the skeleton feature and the RGB feature by exploiting their correlation. Experiments on two benchmarks, NTU RGB+D and SYSU, show that the proposed model achieves competitive performance compared with state-of-the-art methods while reducing the complexity of the network.
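
To make the late fusion stage more concrete, below is a minimal PyTorch-style sketch of a cross-attention block that fuses a skeleton feature with an RGB feature. The feature dimensions, projection layers, and concatenation-based output are illustrative assumptions for exposition only, not the authors' actual implementation.

import torch
import torch.nn as nn


class CrossAttentionFusion(nn.Module):
    # Hypothetical sketch of cross-attention late fusion: skeleton tokens
    # query the RGB spatial tokens, and the attended RGB context is
    # concatenated back onto the skeleton feature. Dimensions are assumed.
    def __init__(self, skel_dim=256, rgb_dim=512, embed_dim=256):
        super().__init__()
        self.q_proj = nn.Linear(skel_dim, embed_dim)   # queries from skeleton
        self.k_proj = nn.Linear(rgb_dim, embed_dim)    # keys from RGB
        self.v_proj = nn.Linear(rgb_dim, embed_dim)    # values from RGB
        self.out_proj = nn.Linear(embed_dim + skel_dim, embed_dim)

    def forward(self, skel_feat, rgb_feat):
        # skel_feat: (B, N_s, skel_dim), e.g. one token per joint
        # rgb_feat:  (B, N_r, rgb_dim),  e.g. a flattened CNN feature map
        q = self.q_proj(skel_feat)                            # (B, N_s, D)
        k = self.k_proj(rgb_feat)                             # (B, N_r, D)
        v = self.v_proj(rgb_feat)                             # (B, N_r, D)
        attn = torch.softmax(q @ k.transpose(-2, -1) / k.size(-1) ** 0.5, dim=-1)
        context = attn @ v                                    # (B, N_s, D)
        fused = torch.cat([context, skel_feat], dim=-1)       # exploit the correlation
        return self.out_proj(fused)                           # fused feature (B, N_s, D)


if __name__ == "__main__":
    fusion = CrossAttentionFusion()
    skel = torch.randn(2, 25, 256)   # e.g., 25 joints as in NTU RGB+D skeletons
    rgb = torch.randn(2, 49, 512)    # e.g., a 7x7 spatial grid from an RGB backbone
    print(fusion(skel, rgb).shape)   # torch.Size([2, 25, 256])

Note that this sketch covers only the late fusion step; in the paper's framework, the early-stage skeleton attention module would additionally bias the RGB frame toward limb-movement regions before features reach such a fusion block.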
