Paper title
Human-Paraphrased References Improve Neural Machine Translation
Paper authors
Paper abstract
Automatic evaluation comparing candidate translations to human-generated paraphrases of reference translations has recently been proposed by Freitag et al. When used in place of original references, the paraphrased versions produce metric scores that correlate better with human judgment. This effect holds for a variety of different automatic metrics, and tends to favor natural formulations over more literal (translationese) ones. In this paper we compare the results of performing end-to-end system development using standard and paraphrased references. With state-of-the-art English-German NMT components, we show that tuning to paraphrased references produces a system that is significantly better according to human judgment, but 5 BLEU points worse when tested on standard references. Our work confirms the finding that paraphrased references yield metric scores that correlate better with human judgment, and demonstrates for the first time that using these scores for system development can lead to significant improvements.