Paper Title

Image Classification using Sequence of Pixels

Authors

Kuldeep, Gajraj

Abstract

This study compares sequential image classification methods based on recurrent neural networks. We describe recurrent architectures such as Long Short-Term Memory (LSTM) and bidirectional Long Short-Term Memory (BiLSTM), and review state-of-the-art sequential image classification architectures, focusing on LSTM, BiLSTM, temporal convolutional networks, and independently recurrent neural networks. RNNs are known to struggle with learning long-term dependencies in the input sequence. We use a simple feature construction method based on the orthogonal Ramanujan periodic transform of the input sequence. Experiments demonstrate that feeding these features to LSTM or BiLSTM networks improves performance drastically. Our focus in this study is on increasing training accuracy while reducing training time for the LSTM and BiLSTM architectures, not on pushing state-of-the-art results, so we use simple LSTM/BiLSTM architectures. We compare raw sequential input against the constructed features as input to single-layer LSTM and BiLSTM networks on the MNIST and CIFAR datasets. With raw sequential input, an LSTM with 128 hidden units trained for five epochs reaches a training accuracy of 33%, whereas the constructed features fed to the same LSTM network reach a training accuracy of 90% in one-third less time.
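The abstract names the orthogonal Ramanujan periodic transform as the feature-construction ingredient but does not specify its construction. As a rough illustration of the underlying mathematics only, below is a minimal NumPy sketch of the classical Ramanujan sum c_q(n) and a projection of a 1-D signal onto periodically extended Ramanujan rows. The function names (`ramanujan_sum`, `rpt_features`) are hypothetical, and this sketch omits the row selection and orthonormalization that the paper's transform implies.

```python
import math
import numpy as np

def ramanujan_sum(q, n):
    """Classical Ramanujan sum c_q(n) = sum over k in [1, q] with
    gcd(k, q) = 1 of cos(2*pi*k*n/q); real because terms pair up."""
    ks = [k for k in range(1, q + 1) if math.gcd(k, q) == 1]
    return sum(math.cos(2 * math.pi * k * n / q) for k in ks)

def ramanujan_basis(length, qs):
    """Stack c_q(n), n = 0..length-1, as rows; each row is q-periodic,
    so it extends naturally to any signal length."""
    return np.array([[ramanujan_sum(q, n) for n in range(length)]
                     for q in qs])

def rpt_features(x, qs):
    """Project signal x onto the (unnormalized) Ramanujan rows.

    Hypothetical illustration: in the paper, features built from such a
    transform replace the raw pixel sequence fed to the LSTM/BiLSTM.
    """
    basis = ramanujan_basis(len(x), qs)
    return basis @ np.asarray(x, dtype=float)
```

Known identities give quick sanity checks: c_1(n) = 1 for all n, c_2(n) = (-1)^n, and c_q(0) equals Euler's totient of q, so a constant signal projects entirely onto the q = 1 row.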
