Title
Performance Evaluation of Advanced Deep Learning Architectures for Offline Handwritten Character Recognition
Authors
Abstract
This paper presents a comparison and performance evaluation of handwritten character recognition for robust and precise classification of different handwritten characters. The system uses an advanced multilayer deep neural network that learns features directly from raw pixel values. Because learning complex features with conventional neural networks is very challenging, the hidden layers are stacked into deep hierarchies of non-linear features. Two state-of-the-art deep learning architectures were used: the Caffe AlexNet and GoogleNet models in NVIDIA DIGITS. The architectures were trained and tested on two different datasets to incorporate diversity and complexity. The first is the publicly available Chars74K dataset, which comprises 7705 characters and contains uppercase and lowercase English letters along with numerical digits. The second, created locally, consists of 4320 characters in 62 classes and was produced by 40 subjects. It also contains uppercase and lowercase English letters along with numerical digits. The overall dataset is divided in a ratio of 80% for training and 20% for testing. Training takes approximately 90 minutes. For validation, the results obtained were compared with the ground truth. The accuracy achieved was 77.77% with AlexNet and 88.89% with GoogleNet. The higher accuracy of GoogleNet is attributed to its unique combination of inception modules, each of which includes pooling, convolutions at various scales, and concatenation operations.
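The 80%/20% division mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say how the split was performed (per dataset or on the combined data, random or stratified), so the helper `split_80_20`, its `seed` parameter, and the use of integer ids in place of images are all assumptions made here for illustration, not the authors' procedure.

```python
import random


def split_80_20(items, train_fraction=0.8, seed=0):
    """Shuffle a copy of `items` and split it into (train, test) lists.

    Hypothetical helper: a plain random split, assumed for illustration.
    """
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    cut = int(round(len(shuffled) * train_fraction))
    return shuffled[:cut], shuffled[cut:]


# Sample counts quoted in the abstract; integer ids stand in for images.
datasets = {"Chars74K": range(7705), "local": range(4320)}

for name, data in datasets.items():
    train, test = split_80_20(data)
    print(name, len(train), len(test))
```

Under these assumptions, 7705 samples divide into 6164 for training and 1541 for testing, and 4320 samples into 3456 and 864. A stratified split, which keeps the 62 classes equally represented in both parts, would be a reasonable alternative when some classes have few samples.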