Paper title
Measuring Harmful Representations in Scandinavian Language Models
Authors
Abstract
Scandinavian countries are perceived as role models when it comes to gender equality. With the advent of pre-trained language models and their widespread use, we investigate to what extent gender-based harmful and toxic content exists in selected Scandinavian language models. We examine nine models, covering Danish, Swedish, and Norwegian, by manually creating template-based sentences and probing the models for completions. We evaluate the completions using two methods for measuring harmful and toxic content and provide a thorough analysis of the results. We show that Scandinavian pre-trained language models contain harmful and gender-based stereotypes, with similar values across all languages. This finding goes against general expectations related to gender equality in Scandinavian countries and shows the possible problematic outcomes of using such models in real-world settings.
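The template-based probing described in the abstract can be illustrated with a minimal sketch. Everything named here is an assumption for illustration only: the templates, the subject terms, the `toy_complete` stand-in for a real masked language model, and the placeholder lexicon are hypothetical, and the simple lexicon-match rate is not claimed to be either of the two evaluation methods used in the paper. English templates are used for readability; the paper works with Danish, Swedish, and Norwegian.

```python
from typing import Callable, Dict, List, Set

# Hypothetical templates; "[MASK]" marks the slot a model would be asked to fill.
TEMPLATES = ["{subject} is known as a [MASK].", "{subject} works as a [MASK]."]

# Hypothetical subject terms, grouped by gender.
SUBJECTS = {"female": ["The woman", "The girl"], "male": ["The man", "The boy"]}


def build_prompts(templates: List[str], subjects: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Instantiate every template with every subject term, grouped by gender."""
    return {
        gender: [t.format(subject=s) for t in templates for s in terms]
        for gender, terms in subjects.items()
    }


def harmful_rate(
    prompts: List[str],
    complete: Callable[[str, int], List[str]],
    lexicon: Set[str],
    top_k: int = 5,
) -> float:
    """Fraction of top-k completions that appear in a (hypothetical) harm lexicon."""
    total = 0
    harmful = 0
    for prompt in prompts:
        for word in complete(prompt, top_k):
            total += 1
            harmful += word.lower() in lexicon
    return harmful / total if total else 0.0


def toy_complete(prompt: str, top_k: int) -> List[str]:
    """Stand-in for a real masked language model; returns fixed candidates."""
    candidates = ["teacher", "PLACEHOLDER_HARMFUL", "doctor", "engineer", "nurse"]
    return candidates[:top_k]


prompts = build_prompts(TEMPLATES, SUBJECTS)
lexicon = {"placeholder_harmful"}  # placeholder entry, not a real lexicon
rates = {g: harmful_rate(p, toy_complete, lexicon) for g, p in prompts.items()}
```

Because `toy_complete` ignores its prompt, both groups get the same rate here; with a real model in its place, comparing `rates["female"]` against `rates["male"]` would show whether completions differ by the gender of the subject.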