Paper Title

Pronoun-Targeted Fine-tuning for NMT with Hybrid Losses

Paper Authors

Prathyusha Jwalapuram, Shafiq Joty, Youlin Shen

Paper Abstract

Popular Neural Machine Translation model training uses strategies like backtranslation to improve BLEU scores, requiring large amounts of additional data and training. We introduce a class of conditional generative-discriminative hybrid losses that we use to fine-tune a trained machine translation model. Through a combination of targeted fine-tuning objectives and intuitive re-use of the training data the model has failed to adequately learn from, we improve the model performance of both a sentence-level and a contextual model without using any additional data. We target the improvement of pronoun translations through our fine-tuning and evaluate our models on a pronoun benchmark testset. Our sentence-level model shows a 0.5 BLEU improvement on both the WMT14 and the IWSLT13 De-En testsets, while our contextual model achieves the best results, improving from 31.81 to 32 BLEU on WMT14 De-En testset, and from 32.10 to 33.13 on the IWSLT13 De-En testset, with corresponding improvements in pronoun translation. We further show the generalizability of our method by reproducing the improvements on two additional language pairs, Fr-En and Cs-En. Code available at <https://github.com/ntunlp/pronoun-finetuning>.
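
The abstract describes fine-tuning a trained NMT model with a conditional generative-discriminative hybrid loss on training examples the model has handled poorly. The sketch below is a minimal, hypothetical illustration of what such a hybrid objective could look like: a weighted sum of a standard token-level cross-entropy (generative) term and a sentence-level discriminative term. The exact loss formulation, weighting, and data-selection criteria used in the paper are not given in the abstract, and names such as `hybrid_loss` and `alpha` are illustrative only.

```python
# Hypothetical sketch of a generative-discriminative hybrid fine-tuning loss.
# Not the paper's exact formulation; see https://github.com/ntunlp/pronoun-finetuning
# for the authors' implementation.
import torch
import torch.nn.functional as F


def hybrid_loss(decoder_logits, target_ids, disc_logits, disc_labels,
                alpha=0.5, pad_id=0):
    """Weighted sum of a token-level generative loss and a sentence-level
    discriminative loss (one plausible instance of a hybrid objective)."""
    # Generative term: standard NMT cross-entropy over target tokens.
    gen = F.cross_entropy(
        decoder_logits.view(-1, decoder_logits.size(-1)),
        target_ids.view(-1),
        ignore_index=pad_id,
    )
    # Discriminative term: e.g. a binary judgment of whether the output
    # handles the targeted phenomenon (here, pronoun translation) correctly.
    disc = F.binary_cross_entropy_with_logits(disc_logits, disc_labels)
    return alpha * gen + (1.0 - alpha) * disc


# Toy usage with random tensors standing in for model outputs.
batch, tgt_len, vocab = 2, 7, 100
logits = torch.randn(batch, tgt_len, vocab, requires_grad=True)
targets = torch.randint(1, vocab, (batch, tgt_len))
disc_logits = torch.randn(batch)
disc_labels = torch.ones(batch)  # 1 = pronoun translated correctly
loss = hybrid_loss(logits, targets, disc_logits, disc_labels)
loss.backward()
print(float(loss))
```

In this sketch, `alpha` trades off the generative and discriminative terms; in a targeted fine-tuning setting one would apply such a loss only to the subset of training sentences the base model translated inadequately (e.g. those with pronoun errors), consistent with the abstract's description of re-using data the model failed to learn from.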
