Paper Title
Counterfactually Measuring and Eliminating Social Bias in Vision-Language Pre-training Models
Paper Authors
Paper Abstract
Vision-Language Pre-training (VLP) models have achieved state-of-the-art performance on numerous cross-modal tasks. Because they are optimized to capture intra- and inter-modality statistical properties, they also risk learning the social biases present in the training data. In this work, we (1) introduce a counterfactual-based bias measurement, \emph{CounterBias}, to quantify social bias in VLP models by comparing the [MASK]ed prediction probabilities of factual and counterfactual samples; (2) construct a novel VL-Bias dataset of 24K image-text pairs for measuring gender bias in VLP models, from which we observe that significant gender bias is prevalent in VLP models; and (3) propose a debiasing method, \emph{FairVLP}, that minimizes the difference in [MASK]ed prediction probabilities between factual and counterfactual image-text pairs. Although CounterBias and FairVLP focus on social bias, they are generalizable tools that provide new insights for probing and regularizing other knowledge in VLP models.
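The abstract's core idea can be illustrated with a minimal sketch. This is an assumed form of a CounterBias-style score, not the paper's exact definition: given a model's [MASK] prediction probability on a factual sample and on its counterfactual (e.g. gender-swapped) counterpart, the signed gap measures bias, and a FairVLP-style objective would penalize that gap during training. All names and numbers below are hypothetical.

```python
def counter_bias(p_factual: float, p_counterfactual: float) -> float:
    """Signed gap in [MASK]ed prediction probabilities.

    A value of 0 means the model predicts the masked token equally well
    for the factual and counterfactual samples (no measured bias here).
    """
    return p_factual - p_counterfactual


def fairness_penalty(p_factual: float, p_counterfactual: float) -> float:
    """A FairVLP-style debiasing term (assumed form): the absolute gap,
    which a training loss could minimize alongside the usual objectives."""
    return abs(p_factual - p_counterfactual)


# Toy probabilities for P([MASK] = "cooking") in pairs such as
# "the man is [MASK]" vs. the counterfactual "the woman is [MASK]":
pairs = [(0.31, 0.12), (0.08, 0.09)]

scores = [counter_bias(f, cf) for f, cf in pairs]
avg_bias = sum(scores) / len(scores)          # aggregate over the dataset
penalties = [fairness_penalty(f, cf) for f, cf in pairs]
```

In practice the probabilities would come from a VLP model's masked-token head evaluated on the VL-Bias image-text pairs; the aggregation over many factual/counterfactual pairs is what makes the measurement dataset-level rather than anecdotal.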