Paper Title

Improving Simultaneous Translation by Incorporating Pseudo-References with Fewer Reorderings

Authors

Junkun Chen, Renjie Zheng, Atsuhito Kita, Mingbo Ma, Liang Huang

Abstract

Simultaneous translation is vastly different from full-sentence translation, in the sense that it starts translation before the source sentence ends, with only a few words delay. However, due to the lack of large-scale, high-quality simultaneous translation datasets, most such systems are still trained on conventional full-sentence bitexts. This is far from ideal for the simultaneous scenario due to the abundance of unnecessary long-distance reorderings in those bitexts. We propose a novel method that rewrites the target side of existing full-sentence corpora into simultaneous-style translation. Experiments on Zh->En and Ja->En simultaneous translation show substantial improvements (up to +2.7 BLEU) with the addition of these generated pseudo-references.
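
To make "only a few words delay" concrete, the following is a minimal sketch of a fixed-lag read/write schedule. It is an illustration only, not the authors' system: the abstract does not specify a decoding policy, and the function name `fixed_lag_schedule` and the lag parameter `k` are hypothetical names introduced here.

```python
def fixed_lag_schedule(source_len: int, target_len: int, k: int) -> list[int]:
    """Return, for each target position, how many source words have been
    read before that target word is emitted.

    Assumption (not from the abstract): the translator first reads k source
    words, then alternates between writing one target word and reading one
    more source word until the source is exhausted.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    # Target word t (0-based) is emitted after reading k + t source words,
    # capped at the full source length.
    return [min(k + t, source_len) for t in range(target_len)]


if __name__ == "__main__":
    # With a 6-word source and a lag of 2, the first target word is
    # emitted after only 2 source words have been read.
    print(fixed_lag_schedule(source_len=6, target_len=6, k=2))
```

Under such a schedule, a reference that places a late source word at the start of the target sentence cannot be produced faithfully, because that source word has not been read yet. This is why references with fewer long-distance reorderings are a better fit for the simultaneous setting.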
