Paper Title

Analyzing Fairness in Deepfake Detection With Massively Annotated Databases

Authors

Ying Xu, Philipp Terhörst, Kiran Raja, Marius Pedersen

Abstract

In recent years, image and video manipulation with Deepfakes has become a severe concern for security and society. Many detection models and datasets have been proposed to detect Deepfake data reliably. However, there is growing concern that these models and training databases might be biased and thus cause Deepfake detectors to fail. In this work, we investigate factors causing biased detection in public Deepfake datasets by (a) creating large-scale demographic and non-demographic attribute annotations with 47 different attributes for five popular Deepfake datasets and (b) comprehensively analysing the attributes resulting in AI bias of three state-of-the-art Deepfake detection backbone models on these datasets. The analysis examines how a large variety of distinctive attributes (from over 65M labels), including demographic (age, gender, ethnicity) and non-demographic (hair, skin, accessories, etc.) attributes, influence detection performance. The results show that the examined datasets have limited diversity and, more importantly, that the utilised Deepfake detection backbone models are strongly affected by the investigated attributes, making them unfair across attributes. Deepfake detection backbone methods trained on such imbalanced/biased datasets produce incorrect detection results, leading to generalisability, fairness, and security issues. Our findings and annotated datasets will guide future research to evaluate and mitigate bias in Deepfake detection techniques. The annotated datasets and the corresponding code are publicly available.
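The fairness analysis the abstract describes amounts to stratifying a detector's error rates by annotated attribute group. Below is a minimal sketch of such a per-attribute evaluation, not the paper's released code: it assumes a pandas DataFrame with hypothetical columns `label` (ground truth, 1 = fake), `pred` (the detector's binary decision), and one column per attribute annotation (e.g. `gender`).

```python
import pandas as pd

def per_attribute_error_rates(df: pd.DataFrame, attribute: str) -> pd.DataFrame:
    """Stratify detection errors by the groups of one annotated attribute.

    Assumes binary ground truth in `label` (1 = fake, 0 = real) and binary
    detector decisions in `pred`; all column names here are hypothetical,
    not taken from the paper's code release.
    """
    rows = []
    for group, sub in df.groupby(attribute):
        fakes = sub[sub["label"] == 1]
        reals = sub[sub["label"] == 0]
        rows.append({
            attribute: group,
            "n": len(sub),
            # False negative rate: fake samples the detector misses.
            "fnr": (fakes["pred"] == 0).mean() if len(fakes) else float("nan"),
            # False positive rate: real samples flagged as fake.
            "fpr": (reals["pred"] == 1).mean() if len(reals) else float("nan"),
        })
    return pd.DataFrame(rows)
```

Large gaps in FNR or FPR between groups of the same attribute (for instance, across ethnicity or accessory annotations) are the kind of unfairness signal the paper reports for the investigated backbone models.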
