Paper Title
Social Biases in Automatic Evaluation Metrics for NLG
Paper Authors
Paper Abstract
Many studies have revealed that word embeddings, language models, and models for specific downstream tasks in NLP are prone to social biases, especially gender bias. Recently, these techniques have been gradually applied to automatic evaluation metrics for text generation. In this paper, we propose an evaluation method based on the Word Embeddings Association Test (WEAT) and the Sentence Embeddings Association Test (SEAT) to quantify social biases in evaluation metrics, and we find that social biases are also widely present in some model-based automatic evaluation metrics. Moreover, we construct gender-swapped meta-evaluation datasets to explore the potential impact of gender bias in image captioning and text summarization tasks. Results show that, given gender-neutral references, model-based evaluation metrics may show a preference for the male hypothesis, and their performance, i.e., the correlation between metric scores and human judgments, usually varies more significantly after gender swapping.
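For orientation, the WEAT effect size that this line of work builds on can be sketched as below. This is a minimal sketch of the standard formulation (the differential cosine association of target sets X and Y with attribute sets A and B, normalized like Cohen's d), not the paper's exact adaptation to evaluation metrics; the random stand-in vectors in the demo are assumptions for illustration only.

```python
import numpy as np

def cosine(u, v):
    """Cosine similarity between two vectors."""
    return np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))

def association(w, A, B):
    """s(w, A, B): mean similarity of w to attribute set A minus set B."""
    return np.mean([cosine(w, a) for a in A]) - np.mean([cosine(w, b) for b in B])

def weat_effect_size(X, Y, A, B):
    """WEAT effect size d: how strongly target set X (vs. Y) associates
    with attribute set A (vs. B), normalized by a pooled std. dev."""
    s_X = [association(x, A, B) for x in X]
    s_Y = [association(y, A, B) for y in Y]
    return (np.mean(s_X) - np.mean(s_Y)) / np.std(s_X + s_Y, ddof=1)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Random stand-in vectors; in practice these would come from the word
    # embeddings or sentence encoder used inside the metric under test.
    X, Y, A, B = (list(rng.normal(size=(4, 50))) for _ in range(4))
    print(weat_effect_size(X, Y, A, B))
```

The gender-swapped meta-evaluation data described in the abstract can likewise be approximated by a word-level swap over references and hypotheses. The pair list below is a hypothetical minimal stand-in, not the paper's actual lexicon, and real pipelines need POS disambiguation for ambiguous forms such as "her" (him/his).

```python
import re

# Hypothetical minimal swap list; an actual lexicon would be much larger.
PAIRS = [("he", "she"), ("him", "her"), ("man", "woman"),
         ("men", "women"), ("boy", "girl"), ("father", "mother")]
SWAP = {a: b for a, b in PAIRS}
SWAP.update({b: a for a, b in PAIRS})

def gender_swap(text):
    """Swap gendered words in both directions, preserving initial capitalization."""
    def repl(m):
        w = m.group(0)
        out = SWAP[w.lower()]
        return out.capitalize() if w[0].isupper() else out
    pattern = r"\b(" + "|".join(map(re.escape, SWAP)) + r")\b"
    return re.sub(pattern, repl, text, flags=re.IGNORECASE)

print(gender_swap("The man gave the boy to the woman."))
# -> "The woman gave the girl to the man."
```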