Self-Supervised Attention Networks and Uncertainty Loss Weighting for Multi-Task Emotion Recognition on Vocal Bursts

Paper Title

Self-Supervised Attention Networks and Uncertainty Loss Weighting for Multi-Task Emotion Recognition on Vocal Bursts

Authors

Vincent Karas, Andreas Triantafyllopoulos, Meishu Song, Björn W. Schuller

Abstract

Vocal bursts play an important role in communicating affect, making them valuable for improving speech emotion recognition. Here, we present our approach for classifying vocal bursts and predicting their emotional significance in the ACII Affective Vocal Burst Workshop & Challenge 2022 (A-VB). We use a large self-supervised audio model as a shared feature extractor and compare multiple architectures built on classifier chains and attention networks, combined with uncertainty loss weighting strategies. Our approach surpasses the challenge baseline by a wide margin on all four tasks.

Paper Title