Paper Title

Perturbation Augmentation for Fairer NLP

Paper Authors

Rebecca Qian, Candace Ross, Jude Fernandes, Eric Smith, Douwe Kiela, Adina Williams

Paper Abstract

Unwanted and often harmful social biases are becoming ever more salient in NLP research, affecting both models and datasets. In this work, we ask whether training on demographically perturbed data leads to fairer language models. We collect a large dataset of human annotated text perturbations and train a neural perturbation model, which we show outperforms heuristic alternatives. We find that (i) language models (LMs) pre-trained on demographically perturbed corpora are typically more fair, and (ii) LMs finetuned on perturbed GLUE datasets exhibit less demographic bias on downstream tasks, and (iii) fairness improvements do not come at the expense of performance on downstream tasks. Lastly, we discuss outstanding questions about how best to evaluate the (un)fairness of large language models. We hope that this exploration of neural demographic perturbation will help drive more improvement towards fairer NLP.
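
To make the idea of demographic perturbation concrete, below is a minimal Python sketch of the kind of heuristic word-list substitution the abstract refers to as a baseline (the approach the paper's trained neural perturber is shown to outperform). The swap list and the function name `heuristic_perturb` are illustrative assumptions for this sketch, not the paper's code or released resources.

```python
import re

# Illustrative swap list for a heuristic perturbation baseline (an assumption for
# this sketch, not the paper's released resources). Ambiguous forms such as "her"
# (object vs. possessive) are omitted because a fixed word list cannot resolve them.
GENDER_SWAPS = {
    "he": "she", "she": "he",
    "him": "her", "his": "her",
    "man": "woman", "woman": "man",
    "men": "women", "women": "men",
}


def heuristic_perturb(text: str, swaps: dict = GENDER_SWAPS) -> str:
    """Swap demographic terms word by word, preserving capitalization."""
    pattern = re.compile(
        r"\b(" + "|".join(map(re.escape, swaps)) + r")\b", re.IGNORECASE
    )

    def replace(match: re.Match) -> str:
        word = match.group(0)
        target = swaps[word.lower()]
        return target.capitalize() if word[0].isupper() else target

    return pattern.sub(replace, text)


if __name__ == "__main__":
    # "He finished his shift and the man went home."
    # -> "She finished her shift and the woman went home."
    print(heuristic_perturb("He finished his shift and the man went home."))
```

The deliberate gaps in the swap list (e.g. the ambiguous "her") illustrate why a fixed word list falls short and why the paper instead trains a neural perturbation model on human-annotated rewrites.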
