Paper Title

Rapid Domain Adaptation for Machine Translation with Monolingual Data

Authors

Mahdis Mahdieh, Mia Xu Chen, Yuan Cao, Orhan Firat

Abstract

One challenge in machine translation is how to adapt quickly to unseen domains in the face of surging events like COVID-19, when timely and accurate translation of in-domain information into multiple languages is critical but little parallel data is available yet. In this paper, we propose an approach that enables rapid domain adaptation from the perspective of unsupervised translation. Our proposed approach requires only in-domain monolingual data and can be quickly applied to a preexisting translation system trained on general-domain data, yielding significant gains in in-domain translation quality with little or no drop in general-domain quality. We also propose an effective procedure for simultaneous adaptation to multiple domains and languages. To the best of our knowledge, this is the first attempt to address unsupervised multilingual domain adaptation.
