Paper Title
Multimodal Representations Learning Based on Mutual Information Maximization and Minimization and Identity Embedding for Multimodal Sentiment Analysis
Paper Authors
Paper Abstract
Multimodal sentiment analysis (MSA) is a fundamental yet complex research problem due to the heterogeneity gap between different modalities and the ambiguity of human emotional expression. Although there have been many successful attempts to construct multimodal representations for MSA, two challenges remain to be addressed: 1) a more robust multimodal representation needs to be constructed to bridge the heterogeneity gap and cope with complex multimodal interactions, and 2) contextual dynamics must be modeled effectively throughout the information flow. In this work, we propose a multimodal representation model based on Mutual Information Maximization and Minimization and Identity Embedding (MMMIE). We combine mutual information maximization between modality pairs with mutual information minimization between the input data and the corresponding features to mine modality-invariant and task-related information. Furthermore, Identity Embedding is proposed to prompt the downstream network to perceive contextual information. Experimental results on two public datasets demonstrate the effectiveness of the proposed model.
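
The abstract describes maximizing mutual information between modality pairs. As a rough illustration only, and not the authors' MMMIE implementation, the PyTorch sketch below shows how an InfoNCE-style lower bound on mutual information between two modality representations can be maximized; all names, dimensions, and hyperparameters (InfoNCE, proj_a, d_text, d_audio, etc.) are assumptions made for this example.

```python
# Illustrative sketch of mutual information maximization between a pair of
# modality representations via an InfoNCE lower bound (assumed setup, not
# the paper's released code).
import torch
import torch.nn as nn
import torch.nn.functional as F

class InfoNCE(nn.Module):
    """InfoNCE lower-bound estimator of mutual information between two modality views."""
    def __init__(self, dim_a: int, dim_b: int, hidden: int = 128):
        super().__init__()
        # Critic implemented with two small projection heads into a shared space.
        self.proj_a = nn.Linear(dim_a, hidden)
        self.proj_b = nn.Linear(dim_b, hidden)

    def forward(self, feat_a: torch.Tensor, feat_b: torch.Tensor) -> torch.Tensor:
        # feat_a: (batch, dim_a), feat_b: (batch, dim_b), aligned by sample.
        za = F.normalize(self.proj_a(feat_a), dim=-1)
        zb = F.normalize(self.proj_b(feat_b), dim=-1)
        logits = za @ zb.t()  # (batch, batch) pairwise similarity scores
        labels = torch.arange(za.size(0), device=za.device)
        # Cross-entropy over the batch treats the diagonal (true pairs) as
        # positives and all other pairs as negatives; minimizing this loss
        # maximizes an InfoNCE lower bound on I(feat_a; feat_b).
        return F.cross_entropy(logits, labels)

# Usage example with random stand-in features for two modalities.
if __name__ == "__main__":
    batch, d_text, d_audio = 16, 768, 74  # assumed feature sizes
    text_feat = torch.randn(batch, d_text)
    audio_feat = torch.randn(batch, d_audio)
    mi_max_loss = InfoNCE(d_text, d_audio)(text_feat, audio_feat)
    print(mi_max_loss.item())
```

In a full model of the kind the abstract outlines, such a term would be combined with a mutual-information minimization objective between raw inputs and their features (e.g., an upper-bound estimator) and with the proposed Identity Embedding; those components are not shown here.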