Paper title
Improving Simultaneous Translation by Incorporating Pseudo-References with Fewer Reorderings
Paper authors
Paper abstract
Simultaneous translation differs substantially from full-sentence translation: it begins translating before the source sentence ends, with a delay of only a few words. However, because large-scale, high-quality simultaneous translation datasets are lacking, most such systems are still trained on conventional full-sentence bitexts. This is far from ideal for the simultaneous scenario, because those bitexts contain many unnecessary long-distance reorderings. We propose a novel method that rewrites the target side of existing full-sentence corpora into simultaneous-style translations. Experiments on Zh->En and Ja->En simultaneous translation show substantial improvements (up to +2.7 BLEU) when these generated pseudo-references are added.