Paper Title

Temporal Graph Modeling for Skeleton-based Action Recognition

Authors

Jianan Li, Xuemei Xie, Zhifu Zhao, Yuhan Cao, Qingzhe Pan, Guangming Shi

Abstract

Graph Convolutional Networks (GCNs), which model skeleton data as graphs, have achieved remarkable performance in skeleton-based action recognition. In particular, the temporal dynamics of a skeleton sequence convey significant information for the recognition task. To model these dynamics, existing GCN-based methods merely stack multiple layers of 1D local convolutions, which extract temporal relations only between adjacent time steps. As many local convolutions are repeated, key temporal information spanning non-adjacent time steps may be lost to information dilution. Consequently, it remains unclear how these methods can fully exploit the temporal dynamics of a skeleton sequence. In this paper, we propose a Temporal Enhanced Graph Convolutional Network (TE-GCN) to address this limitation. TE-GCN constructs a temporal relation graph to capture complex temporal dynamics. Specifically, the constructed graph explicitly builds connections between semantically related temporal features, modeling temporal relations between both adjacent and non-adjacent time steps. Meanwhile, to further explore these dynamics, a multi-head mechanism is designed to investigate multiple kinds of temporal relations. Extensive experiments on two widely used large-scale datasets, NTU-60 RGB+D and NTU-120 RGB+D, show that the proposed model achieves state-of-the-art performance through its contribution to temporal modeling for action recognition.
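The abstract does not give implementation details, but the core idea (a dense temporal relation graph whose edges connect semantically similar frames, with several heads capturing different kinds of relations) can be illustrated with a minimal NumPy sketch. Everything here is an assumption for illustration: the function name `temporal_relation_layer`, the similarity-based edge weights, and the random matrices standing in for learned per-head projections are all hypothetical, not the authors' actual TE-GCN layer.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax for row-normalizing relation weights."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def temporal_relation_layer(feats, num_heads=4, seed=0):
    """Hypothetical sketch of a multi-head temporal relation graph.

    feats: (T, C) array of per-frame skeleton features.
    Each head builds a dense T x T adjacency from feature similarity,
    so adjacent AND non-adjacent time steps are directly connected,
    unlike stacked 1D local convolutions. Random projection matrices
    stand in for learned parameters.
    """
    T, C = feats.shape
    rng = np.random.default_rng(seed)
    head_outputs = []
    for _ in range(num_heads):
        Wq = rng.standard_normal((C, C)) / np.sqrt(C)
        Wk = rng.standard_normal((C, C)) / np.sqrt(C)
        q = feats @ Wq          # (T, C) query-like projection
        k = feats @ Wk          # (T, C) key-like projection
        # Dense temporal relation graph: edge (i, j) weighted by the
        # similarity of frames i and j, regardless of temporal distance.
        A = softmax(q @ k.T / np.sqrt(C), axis=-1)   # (T, T)
        head_outputs.append(A @ feats)               # aggregate over time
    # Average the heads; a real model might concatenate and project instead.
    return np.mean(head_outputs, axis=0)

x = np.random.default_rng(1).standard_normal((16, 8))  # 16 frames, 8-dim features
y = temporal_relation_layer(x)
print(y.shape)  # (16, 8): one enhanced feature vector per frame
```

Because each head's adjacency is a full T x T matrix, a frame can aggregate information from any other frame in one step, which is the property the paper argues stacked local convolutions lack.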
