Paper Title
Enhancing Affective Representations of Music-Induced EEG through Multimodal Supervision and Latent Domain Adaptation
Paper Authors
Paper Abstract
The study of Music Cognition and neural responses to music has been invaluable in understanding human emotions. Brain signals, though, manifest a highly complex structure that makes processing and retrieving meaningful features challenging, particularly for abstract constructs like affect. Moreover, the performance of learning models is undermined by the limited amount of available neuronal data and their severe inter-subject variability. In this paper, we extract efficient, personalized affective representations from EEG signals during music listening. To this end, we employ music signals as a supervisory modality for EEG, aiming to project their semantic correspondence onto a common representation space. We utilize a bi-modal framework combining an LSTM-based attention model to process EEG and a pre-trained model for music tagging, along with a reverse domain discriminator to align the distributions of the two modalities, further constraining the learning process with emotion tags. The resulting framework can be utilized for emotion recognition both directly, by performing supervised predictions from either modality, and indirectly, by retrieving relevant music samples for EEG input queries. The experimental findings show the potential of enhancing neuronal data through stimulus information for recognition purposes and yield insights into the distribution and temporal variance of music-induced affective features.
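The abstract describes the architecture only at a high level: an LSTM-based attention encoder for EEG, a pre-trained music tagger supplying features for the supervising modality, a reverse (gradient-reversal) domain discriminator aligning the two distributions, emotion-tag supervision, and retrieval of music samples for EEG queries. The sketch below is a minimal PyTorch illustration of such a setup, not the authors' implementation; all module names, layer sizes, loss heads, and the cosine-similarity retrieval are assumptions made for illustration.

```python
# Illustrative sketch of a bi-modal EEG/music framework with a
# gradient-reversal domain discriminator. Dimensions and names are assumed.
import torch
import torch.nn as nn
import torch.nn.functional as F


class GradientReversal(torch.autograd.Function):
    """Identity in the forward pass; negates (and scales) gradients backward."""
    @staticmethod
    def forward(ctx, x, lambd):
        ctx.lambd = lambd
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.lambd * grad_output, None


class EEGEncoder(nn.Module):
    """Bidirectional LSTM over EEG feature sequences with attention pooling."""
    def __init__(self, in_dim=128, hidden=256, emb_dim=128):
        super().__init__()
        self.lstm = nn.LSTM(in_dim, hidden, batch_first=True, bidirectional=True)
        self.attn = nn.Linear(2 * hidden, 1)
        self.proj = nn.Linear(2 * hidden, emb_dim)

    def forward(self, x):                       # x: (batch, time, in_dim)
        h, _ = self.lstm(x)                     # (batch, time, 2*hidden)
        w = torch.softmax(self.attn(h), dim=1)  # attention weights over time
        pooled = (w * h).sum(dim=1)             # (batch, 2*hidden)
        return self.proj(pooled)                # embedding in the joint space


class BiModalAffectModel(nn.Module):
    """Projects EEG and pre-extracted music-tagger features into a shared
    space, with an emotion classifier and a reverse-gradient domain
    discriminator that encourages modality-invariant embeddings."""
    def __init__(self, eeg_dim=128, music_dim=512, emb_dim=128, n_emotions=4):
        super().__init__()
        self.eeg_encoder = EEGEncoder(eeg_dim, emb_dim=emb_dim)
        self.music_proj = nn.Sequential(        # head on frozen tagger features
            nn.Linear(music_dim, emb_dim), nn.ReLU(), nn.Linear(emb_dim, emb_dim))
        self.emotion_head = nn.Linear(emb_dim, n_emotions)
        self.domain_head = nn.Linear(emb_dim, 2)  # EEG vs. music

    def forward(self, eeg, music_feats, lambd=1.0):
        z_eeg = self.eeg_encoder(eeg)             # (batch, emb_dim)
        z_music = self.music_proj(music_feats)    # (batch, emb_dim)
        z = torch.cat([z_eeg, z_music], dim=0)
        emotion_logits = self.emotion_head(z)     # supervised by emotion tags
        domain_logits = self.domain_head(GradientReversal.apply(z, lambd))
        return z_eeg, z_music, emotion_logits, domain_logits


def retrieve(query_eeg_emb, music_bank_embs, k=5):
    """Indirect use: rank stored music embeddings by similarity to an EEG query."""
    sims = F.cosine_similarity(query_eeg_emb.unsqueeze(0), music_bank_embs, dim=-1)
    return sims.topk(k).indices
```

In this reading, the emotion head would be trained with a standard classification loss on both modalities, while the domain head, fed through the gradient-reversal function, is trained to distinguish EEG from music embeddings; reversing its gradients pushes the encoders toward distributions the discriminator cannot separate. The retrieval helper illustrates the indirect recognition path mentioned in the abstract, i.e. returning affectively relevant music samples for an EEG query.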