Paper Title


Non-Monotonic Latent Alignments for CTC-Based Non-Autoregressive Machine Translation

Paper Authors

Chenze Shao, Yang Feng

Abstract


Non-autoregressive translation (NAT) models are typically trained with the cross-entropy loss, which forces the model outputs to be aligned verbatim with the target sentence and will highly penalize small shifts in word positions. Latent alignment models relax the explicit alignment by marginalizing out all monotonic latent alignments with the CTC loss. However, they cannot handle non-monotonic alignments, which is non-negligible as there is typically global word reordering in machine translation. In this work, we explore non-monotonic latent alignments for NAT. We extend the alignment space to non-monotonic alignments to allow for the global word reordering and further consider all alignments that overlap with the target sentence. We non-monotonically match the alignments to the target sentence and train the latent alignment model to maximize the F1 score of non-monotonic matching. Extensive experiments on major WMT benchmarks show that our method substantially improves the translation performance of CTC-based models. Our best model achieves 30.06 BLEU on WMT14 En-De with only one-iteration decoding, closing the gap between non-autoregressive and autoregressive models.
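The CTC baseline that the abstract builds on marginalizes over all monotonic latent alignments: every frame-level output sequence that collapses (by removing blanks and repeats) to the target is summed via dynamic programming. The sketch below is a minimal pure-Python illustration of that forward algorithm on probabilities, not the paper's implementation; the function name `ctc_prob` and the toy inputs are illustrative assumptions.

```python
def ctc_prob(probs, target, blank=0):
    """Total probability of all monotonic alignments that collapse to `target`.

    probs:  T x V matrix (list of lists), probs[t][v] = p(token v at frame t)
    target: list of token ids (no blanks)
    Illustrative sketch of the standard CTC forward recursion.
    """
    # Extended target with blanks interleaved: [b, y1, b, y2, ..., b]
    ext = [blank]
    for y in target:
        ext += [y, blank]
    S, T = len(ext), len(probs)

    # alpha[s] = prob of all alignment prefixes ending at extended position s
    alpha = [0.0] * S
    alpha[0] = probs[0][ext[0]]          # start with a blank
    if S > 1:
        alpha[1] = probs[0][ext[1]]      # or with the first target token

    for t in range(1, T):
        new = [0.0] * S
        for s in range(S):
            p = alpha[s]                 # stay on the same symbol
            if s > 0:
                p += alpha[s - 1]        # advance by one position
            # skip the blank between two *different* non-blank tokens
            if s > 1 and ext[s] != blank and ext[s] != ext[s - 2]:
                p += alpha[s - 2]
            new[s] = p * probs[t][ext[s]]
        alpha = new

    # End on the final token or the trailing blank
    return alpha[S - 1] + (alpha[S - 2] if S > 1 else 0.0)
```

For example, with two frames, a vocabulary {blank, a} with uniform probability 0.5, and target `[a]`, the three monotonic alignments `aa`, `_a`, and `a_` each have probability 0.25, so `ctc_prob([[0.5, 0.5], [0.5, 0.5]], [1])` returns 0.75. Because the recursion only ever moves forward through the extended target, it cannot credit reordered outputs, which is exactly the limitation the paper's non-monotonic matching addresses.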
