Inner speech recognition through electroencephalographic signals

Paper title

Inner speech recognition through electroencephalographic signals

Authors

Gasparini, Francesca; Cazzaniga, Elisa; Saibene, Aurora

Abstract

This work focuses on inner speech recognition from EEG signals. Inner speech is defined as the internalized process in which a person thinks in pure meanings, generally associated with auditory imagery of one's own inner "voice". Decoding EEG into text should be understood here as classifying a limited number of words (commands) or detecting the presence of phonemes (the units of sound that make up words). Speech-related BCIs provide effective vocal communication strategies for controlling devices through speech commands interpreted from brain signals, improving the quality of life of people who have lost the ability to speak by restoring communication with their environment. Two public inner speech datasets are analysed. Using these data, several classification models are studied and implemented, ranging from basic methods such as Support Vector Machines, through ensemble methods such as the eXtreme Gradient Boosting classifier, to neural networks such as Long Short Term Memory (LSTM) and Bidirectional Long Short Term Memory (BiLSTM). With the LSTM and BiLSTM models, which are generally not used in the inner speech recognition literature, results in line with or superior to the state of the art are obtained.
