Detecting Emergent Intersectional Biases: Contextualized Word Embeddings Contain a Distribution of Human-like Biases

Paper title

Detecting Emergent Intersectional Biases: Contextualized Word Embeddings Contain a Distribution of Human-like Biases

Authors

Wei Guo, Aylin Caliskan

Abstract

With the starting point that implicit human biases are reflected in the statistical regularities of language, it is possible to measure biases in English static word embeddings. State-of-the-art neural language models generate dynamic word embeddings dependent on the context in which the word appears. Current methods measure pre-defined social and intersectional biases that appear in particular contexts defined by sentence templates. Dispensing with templates, we introduce the Contextualized Embedding Association Test (CEAT), that can summarize the magnitude of overall bias in neural language models by incorporating a random-effects model. Experiments on social and intersectional biases show that CEAT finds evidence of all tested biases and provides comprehensive information on the variance of effect magnitudes of the same bias in different contexts. All the models trained on English corpora that we study contain biased representations. Furthermore, we develop two methods, Intersectional Bias Detection (IBD) and Emergent Intersectional Bias Detection (EIBD), to automatically identify the intersectional biases and emergent intersectional biases from static word embeddings in addition to measuring them in contextualized word embeddings. We present the first algorithmic bias detection findings on how intersectional group members are strongly associated with unique emergent biases that do not overlap with the biases of their constituent minority identities. IBD and EIBD achieve high accuracy when detecting the intersectional and emergent biases of African American females and Mexican American females. Our results indicate that biases at the intersection of race and gender associated with members of multiple minority groups, such as African American females and Mexican American females, have the highest magnitude across all neural language models.
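
The abstract says that CEAT summarizes the overall bias magnitude by incorporating a random-effects model over effect sizes measured in many contexts. The following is a minimal sketch of that idea, not the authors' implementation: it assumes the standard DerSimonian-Laird estimator, and the function name `combine_random_effects` and the sample numbers are hypothetical, chosen only for illustration.

```python
from typing import List, Sequence, Tuple


def combine_random_effects(
    effects: Sequence[float], variances: Sequence[float]
) -> Tuple[float, float]:
    """Combine per-context effect sizes with a random-effects model.

    Assumption: the DerSimonian-Laird estimator is used for the
    between-context variance (tau^2); the abstract does not name one.
    Returns (combined_effect_size, tau_squared).
    """
    n = len(effects)
    if n != len(variances) or n < 2:
        raise ValueError("need at least two effect sizes with matching variances")
    if any(v <= 0 for v in variances):
        raise ValueError("variances must be positive")

    # Fixed-effect (inverse-variance) weights and weighted mean.
    weights: List[float] = [1.0 / v for v in variances]
    weight_sum = sum(weights)
    fixed_mean = sum(w * e for w, e in zip(weights, effects)) / weight_sum

    # Cochran's Q measures heterogeneity of the effects across contexts.
    q = sum(w * (e - fixed_mean) ** 2 for w, e in zip(weights, effects))

    # Between-context variance, truncated at zero.
    c = weight_sum - sum(w * w for w in weights) / weight_sum
    tau_squared = max(0.0, (q - (n - 1)) / c)

    # Random-effects weights add tau^2 to each within-context variance.
    re_weights = [1.0 / (v + tau_squared) for v in variances]
    combined = sum(w * e for w, e in zip(re_weights, effects)) / sum(re_weights)
    return combined, tau_squared


if __name__ == "__main__":
    # Hypothetical effect sizes of one bias test measured in four contexts.
    sample_effects = [0.9, 0.4, 1.2, 0.1]
    sample_variances = [0.05, 0.08, 0.06, 0.07]
    ces, tau2 = combine_random_effects(sample_effects, sample_variances)
    print(f"combined effect size: {ces:.3f}, between-context variance: {tau2:.3f}")
```

When the per-context effects disagree more than their sampling variances would explain, the between-context variance becomes positive and the weights move closer to equal, so the summary reflects a distribution of biases across contexts rather than a single template-defined value.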
