Paper Title

Affective Behaviour Analysis Using Pretrained Model with Facial Priori

Paper Authors

Yifan Li, Haomiao Sun, Zhaori Liu, Hu Han

Paper Abstract

Affective behaviour analysis has aroused researchers' attention due to its broad applications. However, it is labor-intensive to obtain accurate annotations for massive face images. Thus, we propose to utilize prior facial information via a Masked Auto-Encoder (MAE) pretrained on unlabeled face images. Furthermore, we combine the MAE-pretrained Vision Transformer (ViT) and an AffectNet-pretrained CNN to perform multi-task emotion recognition. We notice that expression and action unit (AU) scores are pure and intact features for valence-arousal (VA) regression. As a result, we utilize the AffectNet-pretrained CNN to extract expression scores, which are concatenated with the expression and AU scores from the ViT to obtain the final VA features. Moreover, we also propose a co-training framework with two parallel MAE-pretrained ViTs for the expression recognition task. To make the two views independent, we randomly mask most of the patches during training. Then, a JS-divergence loss is applied to make the predictions of the two views as consistent as possible. The results on ABAW4 show that our methods are effective.
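To make the VA feature construction in the abstract concrete, the minimal PyTorch sketch below assumes a hypothetical `VAHead` module that concatenates the CNN's expression scores with the ViT's expression and AU scores and regresses valence and arousal; the class counts, hidden size, and tanh output are illustrative assumptions, not the authors' reported configuration.

```python
import torch
import torch.nn as nn

class VAHead(nn.Module):
    """Hypothetical VA regression head (illustrative, not the paper's exact config).

    Concatenates CNN expression scores with ViT expression and AU scores,
    then regresses valence and arousal.
    """
    def __init__(self, num_expr=8, num_au=12):  # assumed class counts
        super().__init__()
        in_dim = num_expr + num_expr + num_au   # CNN expr + ViT expr + ViT AU
        self.regressor = nn.Sequential(
            nn.Linear(in_dim, 64),
            nn.ReLU(),
            nn.Linear(64, 2),                   # (valence, arousal)
        )

    def forward(self, cnn_expr, vit_expr, vit_au):
        feats = torch.cat([cnn_expr, vit_expr, vit_au], dim=-1)
        return torch.tanh(self.regressor(feats))  # VA labels lie in [-1, 1]
```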

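The co-training consistency term can likewise be sketched as a Jensen-Shannon divergence between the two masked views' predicted expression distributions; this is a minimal formulation assuming softmax probabilities and PyTorch's `F.kl_div` convention, with the reduction choice being an assumption rather than the paper's setting.

```python
import torch
import torch.nn.functional as F

def js_consistency(logits_view1, logits_view2):
    """JS divergence between the two views' predicted class distributions."""
    p = F.softmax(logits_view1, dim=-1)
    q = F.softmax(logits_view2, dim=-1)
    m = 0.5 * (p + q)
    # F.kl_div expects log-probabilities as input and probabilities as target,
    # so the two terms below are KL(p || m) and KL(q || m) respectively.
    return 0.5 * (F.kl_div(m.log(), p, reduction="batchmean")
                  + F.kl_div(m.log(), q, reduction="batchmean"))
```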