Paper Title

Effect of different splitting criteria on the performance of speech emotion recognition

Authors

Atmaja, Bagus Tris; Sasou, Akira

Abstract

Traditional speech emotion recognition (SER) evaluations have been performed only under a speaker-independent condition, and some studies did not even evaluate their results under this condition. This paper highlights the importance of splitting training and test data for SER by script, known as the sentence-open or text-independent criterion. The results show that employing the sentence-open criterion degraded SER performance. This finding points to the difficulty of recognizing emotion from speech when the linguistic information embedded in the acoustic information differs. Surprisingly, the text-independent criterion consistently performed worse than the speaker+text-independent criterion. The full order of splitting criteria by SER performance, from the most difficult to the easiest, is text-independent, speaker+text-independent, speaker-independent, and speaker+text-dependent. The gap between the speaker+text-independent and text-independent criteria was smaller than the gaps between other criteria, reinforcing the difficulty of recognizing emotion from speech across different sentences.
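
The four splitting criteria named in the abstract can be illustrated with a minimal sketch. This is not the authors' procedure: the `Utterance` fields, the `split` helper, the toy corpus, and the choice to discard utterances that share only a speaker or only a script with the test set are all assumptions made for illustration.

```python
from typing import NamedTuple


class Utterance(NamedTuple):
    speaker: str  # hypothetical speaker ID
    script: str   # hypothetical ID of the sentence being read
    emotion: str  # emotion label


CRITERIA = (
    "speaker+text-dependent",
    "speaker-independent",
    "text-independent",
    "speaker+text-independent",
)


def split(utterances, criterion, test_speakers=(), test_scripts=()):
    """Return (train, test) lists for one splitting criterion."""
    train, test = [], []
    for i, u in enumerate(utterances):
        new_speaker = u.speaker in test_speakers
        new_script = u.script in test_scripts
        if criterion == "speaker+text-dependent":
            # Speakers and scripts may appear on both sides of the split.
            (test if i % 5 == 0 else train).append(u)
        elif criterion == "speaker-independent":
            (test if new_speaker else train).append(u)
        elif criterion == "text-independent":
            (test if new_script else train).append(u)
        elif criterion == "speaker+text-independent":
            if new_speaker and new_script:
                test.append(u)
            elif not new_speaker and not new_script:
                train.append(u)
            # Assumption: utterances sharing only one factor are discarded.
        else:
            raise ValueError(f"unknown criterion: {criterion}")
    return train, test


# Toy corpus (assumed): every speaker reads every script once.
speakers = ["s1", "s2", "s3", "s4"]
scripts = ["t1", "t2", "t3", "t4"]
emotions = ["angry", "happy", "neutral", "sad"]
corpus = [
    Utterance(s, t, emotions[(i + j) % 4])
    for i, s in enumerate(speakers)
    for j, t in enumerate(scripts)
]

# (train size, test size) per criterion, holding out speaker s4 and script t4.
sizes = {
    c: tuple(len(part) for part in split(corpus, c, {"s4"}, {"t4"}))
    for c in CRITERIA
}
```

Under these assumptions, the speaker+text-independent split keeps the least data, because any utterance that overlaps with the test set in exactly one factor cannot be placed on either side.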
