Paper title
On Exposure Bias, Hallucination and Domain Shift in Neural Machine Translation
Paper authors
Abstract
The standard training algorithm in neural machine translation (NMT) suffers from exposure bias, and alternative algorithms have been proposed to mitigate this. However, the practical impact of exposure bias is under debate. In this paper, we link exposure bias to another well-known problem in NMT, namely the tendency to generate hallucinations under domain shift. In experiments on three datasets with multiple test domains, we show that exposure bias is partially to blame for hallucinations, and that training with Minimum Risk Training, which avoids exposure bias, can mitigate this. Our analysis explains why exposure bias is more problematic under domain shift, and also links exposure bias to the beam search problem, i.e. performance deterioration with increasing beam size. Our results provide a new justification for methods that reduce exposure bias: even if they do not increase performance on in-domain test sets, they can increase model robustness to domain shift.