Multi-Classifier Interactive Learning for Ambiguous Speech Emotion Recognition

Paper Title

Multi-Classifier Interactive Learning for Ambiguous Speech Emotion Recognition

Authors

Ying Zhou, Xuefeng Liang, Yu Gu, Yifei Yin, Longshan Yao

Abstract

In recent years, speech emotion recognition has become highly significant in industrial applications such as call centers, social robots, and health care. Combining speech recognition with speech emotion recognition can improve feedback efficiency and quality of service. Speech emotion recognition has therefore attracted much attention in both industry and academia. Because the emotions present in an entire utterance may have varied probabilities, speech emotion is likely to be ambiguous, which poses great challenges for recognition tasks. However, previous studies commonly assigned a single label or multiple labels to each utterance with certainty. As a result, their algorithms achieve low accuracy because of this inappropriate representation. Inspired by the optimally interacting theory, we address ambiguous speech emotions by proposing a novel multi-classifier interactive learning (MCIL) method. In MCIL, multiple different classifiers first mimic several individuals who have inconsistent cognitions of ambiguous emotions, and construct new ambiguous labels (the emotion probability distribution). The classifiers are then retrained with the new labels so that their cognitions interact. This procedure enables each classifier to learn better representations of ambiguous data from the others, and further improves its recognition ability. Experiments on three benchmark corpora (MAS, IEMOCAP, and FAU-AIBO) demonstrate that MCIL not only improves each classifier's performance but also raises their recognition consistency from moderate to substantial.
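The label-construction step described in the abstract can be illustrated with a small sketch. The abstract does not say how the classifiers' outputs are combined or which loss is used for retraining, so the plain averaging in `build_ambiguous_labels` and the soft-label cross-entropy in `soft_cross_entropy` are assumptions made for illustration only, not the authors' method. The function names and the toy probabilities are likewise hypothetical.

```python
import numpy as np


def build_ambiguous_labels(prob_list):
    """Combine per-classifier outputs into one emotion distribution per utterance.

    ASSUMPTION: simple averaging; the abstract does not specify the rule.
    Each array in prob_list has shape (n_utterances, n_emotions).
    """
    stacked = np.stack(prob_list, axis=0)  # (n_classifiers, n_utterances, n_emotions)
    mean = stacked.mean(axis=0)
    # Renormalise so every row is a valid probability distribution.
    return mean / mean.sum(axis=1, keepdims=True)


def soft_cross_entropy(pred, target, eps=1e-12):
    """Cross-entropy against soft (distribution) targets.

    ASSUMPTION: one plausible retraining objective, not taken from the paper.
    """
    return float(-(target * np.log(pred + eps)).sum(axis=1).mean())


# Toy outputs of three hypothetical classifiers for two utterances
# over three emotion classes.
probs_a = np.array([[0.7, 0.2, 0.1], [0.1, 0.6, 0.3]])
probs_b = np.array([[0.5, 0.4, 0.1], [0.2, 0.2, 0.6]])
probs_c = np.array([[0.6, 0.3, 0.1], [0.3, 0.4, 0.3]])

soft_labels = build_ambiguous_labels([probs_a, probs_b, probs_c])

# Loss each classifier would start from when retrained on the shared labels.
losses = [soft_cross_entropy(p, soft_labels) for p in (probs_a, probs_b, probs_c)]
```

In this sketch every classifier is retrained toward the same shared distribution, which is one simple way to let each model absorb what the others predict for an ambiguous utterance.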
