Paper title
Gender Coreference and Bias Evaluation at WMT 2020
Paper authors
Abstract
Gender bias in machine translation can manifest when gender inflections are chosen based on spurious gender correlations, for example, always translating doctors as men and nurses as women. This can be particularly harmful as models become more popular and are deployed within commercial systems. Our work presents the largest evidence for this phenomenon to date, covering more than 19 systems submitted to WMT over four diverse target languages: Czech, German, Polish, and Russian. To achieve this, we use WinoMT, a recent automatic test suite that examines gender coreference and bias when translating from English into languages with grammatical gender. We extend WinoMT to handle two new languages tested at WMT: Polish and Czech. We find that all systems consistently rely on spurious correlations in the data rather than meaningful contextual information.
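The abstract's evaluation idea can be illustrated with a minimal, hypothetical sketch (this is not the actual WinoMT code; the `Example` fields and metric names are illustrative assumptions): each test sentence fixes the gender of a coreferent entity in English, the system's translation is inspected for the gender it actually produced, and accuracy is broken down by whether the required gender matches the occupation's stereotype.

```python
# Hypothetical sketch of WinoMT-style scoring (illustrative only, not the
# real WinoMT implementation).
from dataclasses import dataclass

@dataclass
class Example:
    gold: str            # gender required by the English context: "male"/"female"
    predicted: str       # gender realized in the system's translation
    stereotypical: bool  # does the gold gender match the occupation's stereotype?

def score(examples):
    """Return overall accuracy and the pro- minus anti-stereotypical
    accuracy gap (a Delta-S-like quantity)."""
    def acc(subset):
        return sum(e.gold == e.predicted for e in subset) / len(subset) if subset else 0.0
    pro = [e for e in examples if e.stereotypical]
    anti = [e for e in examples if not e.stereotypical]
    return acc(examples), acc(pro) - acc(anti)

# Toy data echoing the abstract's example: doctors rendered masculine,
# nurses feminine, regardless of context.
examples = [
    Example("male", "male", True),      # male doctor, translated masculine: correct
    Example("female", "male", False),   # female doctor, still masculine: error
    Example("female", "female", True),  # female nurse, translated feminine: correct
    Example("male", "female", False),   # male nurse, translated feminine: error
]
accuracy, delta_s = score(examples)
# accuracy -> 0.5; delta_s -> 1.0 (perfect on stereotypical, zero on anti-stereotypical)
```

A large positive gap like this is the signature of a system following stereotype correlations instead of the sentence's actual coreference cues.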