Paper Title
An Information-rich Sampling Technique over Spatio-Temporal CNN for Classification of Human Actions in Videos
Paper Authors
Paper Abstract
We propose a novel scheme for human action recognition in videos, using a 3-dimensional Convolutional Neural Network (3D CNN) based classifier. Traditionally, in deep learning based human activity recognition approaches, either a few random frames or every $k^{th}$ frame of the video is considered for training the 3D CNN, where $k$ is a small positive integer, such as 4, 5, or 6. This kind of sampling reduces the volume of the input data, which speeds up training of the network and also avoids over-fitting to some extent, thus enhancing the performance of the 3D CNN model. In the proposed video sampling technique, $k$ consecutive frames of a video are aggregated into a single frame by computing a Gaussian-weighted sum of the $k$ frames. The resulting (aggregated) frame preserves the information better than the conventional approaches and is experimentally shown to perform better. In this paper, a 3D CNN architecture is proposed to extract spatio-temporal features, followed by a Long Short-Term Memory (LSTM) network to recognize human actions. The proposed 3D CNN architecture is capable of handling videos where the camera is placed at a distance from the performer. Experiments are performed on the KTH and Weizmann human action datasets, and the proposed scheme is shown to produce results comparable to state-of-the-art techniques.
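As a concrete illustration of the aggregation step, the sketch below builds each aggregated frame as a Gaussian-weighted sum of $k$ consecutive frames. This is a minimal NumPy sketch under stated assumptions: the frame layout `(k, H, W[, C])`, the value of `sigma`, and the centring and normalisation of the weights are illustrative choices, since the abstract does not fix these parameters.

```python
import numpy as np

def aggregate_frames(frames, sigma=1.0):
    """Aggregate k consecutive frames into one via a Gaussian-weighted sum.

    frames: array of shape (k, H, W) or (k, H, W, C) -- assumed layout.
    The weights form a discrete Gaussian centred on the middle frame,
    normalised to sum to 1 (an assumption; the paper's exact weighting
    is not specified in the abstract).
    """
    k = frames.shape[0]
    centre = (k - 1) / 2.0
    idx = np.arange(k)
    weights = np.exp(-((idx - centre) ** 2) / (2.0 * sigma ** 2))
    weights /= weights.sum()
    # Contract the temporal axis: weighted sum over the k frames.
    return np.tensordot(weights, frames.astype(np.float64), axes=(0, 0))

def sample_video(video, k=5, sigma=1.0):
    """Turn a video of shape (T, H, W[, C]) into T//k aggregated frames."""
    return np.stack([
        aggregate_frames(video[i:i + k], sigma)
        for i in range(0, len(video) - k + 1, k)
    ])
```

Compared with keeping only every $k^{th}$ frame, each output frame here blends all $k$ inputs, so motion occurring between the sampled instants still contributes to the retained frame.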
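The classifier pairs a 3D CNN feature extractor with an LSTM over the resulting temporal sequence. The following PyTorch sketch shows one plausible arrangement of that pipeline; all layer widths, kernel shapes, the hidden dimension, and the 6-class output (matching the six KTH actions) are assumptions for illustration, not the paper's exact architecture.

```python
import torch
import torch.nn as nn

class CNN3DLSTM(nn.Module):
    """Sketch: 3D-CNN spatio-temporal features followed by an LSTM classifier."""

    def __init__(self, num_classes=6, hidden=128):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv3d(1, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool3d((1, 2, 2)),   # pool spatially, keep temporal length
            nn.Conv3d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool3d((1, 2, 2)),
            nn.AdaptiveAvgPool3d((None, 4, 4)),  # fixed 4x4 spatial map per step
        )
        self.lstm = nn.LSTM(32 * 4 * 4, hidden, batch_first=True)
        self.classifier = nn.Linear(hidden, num_classes)

    def forward(self, x):                         # x: (batch, 1, T, H, W)
        f = self.features(x)                      # (batch, 32, T, 4, 4)
        f = f.permute(0, 2, 1, 3, 4).flatten(2)   # (batch, T, 32*4*4)
        out, _ = self.lstm(f)                     # run LSTM over time steps
        return self.classifier(out[:, -1])        # classify from last step
```

The spatial pooling before the LSTM is one way to make the network insensitive to how small the performer appears in the frame, which is in the spirit of the claim about handling distant-camera videos, though the actual mechanism used in the paper may differ.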