Paper Title
$M^3$T: Multi-Modal Continuous Valence-Arousal Estimation in the Wild
Paper Authors
Paper Abstract
This report describes a multi-modal multi-task ($M^3$T) approach underlying our submission to the valence-arousal estimation track of the Affective Behavior Analysis in-the-wild (ABAW) Challenge, held in conjunction with the IEEE International Conference on Automatic Face and Gesture Recognition (FG) 2020. In the proposed $M^3$T framework, we fuse visual features from the videos with acoustic features from the audio tracks to estimate valence and arousal. The spatio-temporal visual features are extracted with a 3D convolutional network and a bidirectional recurrent neural network. Considering the correlations between valence/arousal, emotions, and facial actions, we also explore mechanisms to benefit from other tasks. We evaluated the $M^3$T framework on the validation set provided by ABAW, and it significantly outperforms the baseline method.
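The abstract says that visual and acoustic features are fused to estimate valence and arousal, but it does not say how. As a rough illustration only, the sketch below assumes the simplest form of feature-level fusion: the two modality vectors are concatenated per time step and passed through a linear layer with a tanh output, which presumes targets in [-1, 1]. The names `fuse_frame`, `linear_tanh_head`, and `estimate_sequence`, the fixed weights, and the tanh head are all hypothetical and are not taken from the paper; the 3D convolutional and recurrent feature extractors are replaced here by precomputed per-frame vectors.

```python
import math
from typing import List, Sequence, Tuple

Vector = List[float]


def fuse_frame(visual: Sequence[float], acoustic: Sequence[float]) -> Vector:
    """Feature-level fusion (assumed): concatenate both modality vectors."""
    return list(visual) + list(acoustic)


def linear_tanh_head(
    features: Sequence[float],
    weights: Sequence[Sequence[float]],
    bias: Sequence[float],
) -> Tuple[float, float]:
    """Map a fused vector to (valence, arousal) with a linear layer and tanh.

    The tanh bounds each output to [-1, 1]; that range is an assumption.
    """
    if len(weights) != 2 or len(bias) != 2:
        raise ValueError("expected exactly two output rows: valence, arousal")
    outputs = []
    for row, b in zip(weights, bias):
        if len(row) != len(features):
            raise ValueError("weight row length must match fused feature size")
        outputs.append(math.tanh(sum(w * x for w, x in zip(row, features)) + b))
    return outputs[0], outputs[1]


def estimate_sequence(
    visual_seq: Sequence[Sequence[float]],
    acoustic_seq: Sequence[Sequence[float]],
    weights: Sequence[Sequence[float]],
    bias: Sequence[float],
) -> List[Tuple[float, float]]:
    """Produce one (valence, arousal) pair per time step of aligned features."""
    if len(visual_seq) != len(acoustic_seq):
        raise ValueError("visual and acoustic sequences must be time-aligned")
    return [
        linear_tanh_head(fuse_frame(v, a), weights, bias)
        for v, a in zip(visual_seq, acoustic_seq)
    ]


# Toy usage: two time steps, 2-D visual and 1-D acoustic features (made up).
visual_features = [[0.2, -0.1], [0.4, 0.3]]
acoustic_features = [[0.5], [-0.2]]
head_weights = [[1.0, 0.0, 0.5], [0.0, 1.0, -0.5]]  # hypothetical weights
head_bias = [0.0, 0.0]
predictions = estimate_sequence(
    visual_features, acoustic_features, head_weights, head_bias
)
print(predictions)
```

In a trained model the weights would be learned jointly with the feature extractors; the fixed values here only demonstrate how the data flows from two modalities to a per-frame pair of continuous outputs.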