Paper Title
Assessing the Bilingual Knowledge Learned by Neural Machine Translation Models
Paper Authors
Paper Abstract
Machine translation (MT) systems translate text between different languages by automatically learning in-depth knowledge of bilingual lexicons, grammar, and semantics from training examples. Although neural machine translation (NMT) leads the field of MT, we have a poor understanding of how and why it works. In this paper, we bridge the gap by assessing the bilingual knowledge learned by NMT models with a phrase table, an interpretable table of bilingual lexicons. We extract the phrase table from the training examples that an NMT model correctly predicts. Extensive experiments on widely used datasets show that the phrase table is reasonable and consistent across language pairs and random seeds. Equipped with the interpretable phrase table, we find that NMT models learn patterns from simple to complex and distill essential bilingual knowledge from the training examples. We also revisit some advances that potentially affect the learning of bilingual knowledge (e.g., back-translation) and report some interesting findings. We believe this work opens a new angle for interpreting NMT with statistical models and provides empirical support for recent advances in improving NMT models.