Paper Title

Gaze-enhanced Crossmodal Embeddings for Emotion Recognition

Authors

Ahmed Abdou, Ekta Sood, Philipp Müller, Andreas Bulling

Abstract

Emotional expressions are inherently multimodal -- integrating facial behavior, speech, and gaze -- but their automatic recognition is often limited to a single modality, e.g. speech during a phone call. While previous work proposed crossmodal emotion embeddings to improve monomodal recognition performance, an explicit representation of gaze was not included, despite its importance. We propose a new approach to emotion recognition that incorporates an explicit representation of gaze in a crossmodal emotion embedding framework. We show that our method outperforms the previous state of the art for both audio-only and video-only emotion classification on the popular One-Minute Gradual Emotion Recognition dataset. Furthermore, we report extensive ablation experiments and provide detailed insights into the performance of different state-of-the-art gaze representations and integration strategies. Our results not only underline the importance of gaze for emotion recognition but also demonstrate a practical and highly effective approach to leveraging gaze information for this task.
