Paper Title
QUAK: A Synthetic Quality Estimation Dataset for Korean-English Neural Machine Translation
Authors
Abstract
With recent advances in neural machine translation demonstrating its importance, research on quality estimation (QE) has been steadily progressing. QE aims to automatically predict the quality of machine translation (MT) output without reference sentences. Despite its high utility in the real world, manual QE data creation has several limitations: it inevitably incurs non-trivial costs due to the need for translation experts, and it is difficult to scale to more data and more languages. To tackle these limitations, we present QUAK, a Korean-English synthetic QE dataset generated in a fully automatic manner. It consists of three sub-datasets, QUAK-M, QUAK-P, and QUAK-H, produced through three strategies that are relatively free from language constraints. Since each strategy requires no human effort, which facilitates scalability, we scale our data up to 1.58M for QUAK-P and QUAK-H and 6.58M for QUAK-M. As an experiment, we quantitatively analyze word-level QE results in various ways while performing statistical analysis. Moreover, we show that datasets scaled in an efficient way also contribute to performance improvements, observing meaningful performance gains for QUAK-M and QUAK-P when adding data up to 1.58M.