Paper Title
Integrating Translation Memories into Non-Autoregressive Machine Translation
Paper Authors
Paper Abstract
Non-autoregressive machine translation (NAT) has recently made great progress. However, most works to date have focused on standard translation tasks, even though some edit-based NAT models, such as the Levenshtein Transformer (LevT), seem well suited to translating with a Translation Memory (TM). This is the scenario considered here. We first analyze the vanilla LevT model and explain why it does not do well in this setting. We then propose a new variant, TM-LevT, and show how to effectively train this model. By modifying the data presentation and introducing an extra deletion operation, we obtain performance that is on par with an autoregressive approach, while reducing the decoding load. We also show that incorporating TMs during training dispenses with the need for knowledge distillation, a well-known trick otherwise used to mitigate the multimodality issue.
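The abstract describes edit-based decoding that starts from a TM match and refines it with deletion and insertion operations. The sketch below is a rough, hypothetical illustration of such a LevT-style refinement loop, not the paper's actual implementation or API; the functions `predict_deletions`, `predict_placeholders`, and `predict_tokens` are assumed stand-ins for the model's policy heads.

```python
# Minimal sketch of edit-based decoding initialized from a Translation Memory
# (TM) match, in the spirit of LevT-style iterative refinement.
# NOTE: predict_deletions / predict_placeholders / predict_tokens are
# hypothetical placeholders for the model's policy heads, not the paper's API.

from typing import Callable, List


def edit_decode(
    tm_target: List[str],
    predict_deletions: Callable[[List[str]], List[bool]],
    predict_placeholders: Callable[[List[str]], List[int]],
    predict_tokens: Callable[[List[str]], List[str]],
    max_iters: int = 10,
) -> List[str]:
    """Iteratively refine a TM-retrieved target with delete/insert operations."""
    # Start decoding from the TM match rather than from an empty sequence.
    hypothesis = list(tm_target)
    for _ in range(max_iters):
        # 1) Deletion: drop tokens the model marks as irrelevant to the source.
        keep = predict_deletions(hypothesis)
        hypothesis = [tok for tok, k in zip(hypothesis, keep) if k]

        # 2) Placeholder insertion: open new slots between remaining tokens
        #    (slots has one entry per gap, i.e. len(hypothesis) + 1 entries).
        slots = predict_placeholders(hypothesis)
        with_slots: List[str] = []
        for i, n in enumerate(slots):
            with_slots.extend(["<plh>"] * n)
            if i < len(hypothesis):
                with_slots.append(hypothesis[i])

        # 3) Token prediction: fill all placeholders in parallel.
        new_hypothesis = predict_tokens(with_slots)

        # Stop once the model proposes no further edits.
        if new_hypothesis == hypothesis:
            break
        hypothesis = new_hypothesis
    return hypothesis
```

In this reading, the TM match plays the role of the initial hypothesis, and the extra deletion step removes TM tokens that do not fit the new source sentence before insertions fill in the rest.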