Paper title
Word Alignment in the Era of Deep Learning: A Tutorial
Paper authors
Abstract
The word alignment task, despite its prominence in the era of statistical machine translation (SMT), is niche and under-explored today. In this two-part tutorial, we argue for the continued relevance of word alignment. The first part provides historical background on word alignment as a core component of the traditional SMT pipeline. We zero in on GIZA++, an unsupervised, statistical word aligner with surprising longevity. Jumping forward to the era of neural machine translation (NMT), we show how insights from word alignment inspired the attention mechanism fundamental to present-day NMT. The second part shifts to a survey approach. We cover neural word aligners, showing the slow but steady progress towards surpassing GIZA++ performance. Finally, we cover the present-day applications of word alignment, from cross-lingual annotation projection to improving translation.