Paper Title

Using Context in Neural Machine Translation Training Objectives

Paper Authors

Danielle Saunders, Felix Stahlberg, Bill Byrne

Paper Abstract

We present Neural Machine Translation (NMT) training using document-level metrics with batch-level documents. Previous sequence-objective approaches to NMT training focus exclusively on sentence-level metrics like sentence BLEU which do not correspond to the desired evaluation metric, typically document BLEU. Meanwhile, research into document-level NMT training focuses on data or model architecture rather than training procedure. We find that each of these lines of research has a clear space in it for the other, and propose merging them with a scheme that allows a document-level evaluation metric to be used in the NMT training objective. We first sample pseudo-documents from sentence samples. We then approximate the expected document BLEU gradient with Monte Carlo sampling for use as a cost function in Minimum Risk Training (MRT). This two-level sampling procedure gives NMT performance gains over sequence MRT and maximum-likelihood training. We demonstrate that training is more robust for document-level metrics than with sequence metrics. We further demonstrate improvements on NMT with TER and Grammatical Error Correction (GEC) using GLEU, both metrics used at the document level for evaluation.
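
The abstract compresses the method into two sampling levels: sample translations per sentence, then assemble pseudo-documents from those samples and score them with a document-level metric inside the MRT risk. The sketch below illustrates one plausible reading of that scheme, not the authors' implementation. It assumes per-sentence translation samples and their model log-probabilities are already available, uses sacrebleu's corpus-level BLEU as the document metric, and all function and parameter names (doc_bleu, mrt_document_risk, n_pseudo_docs) are illustrative.

```python
import math
import random

import sacrebleu


def doc_bleu(hyp_doc, ref_doc):
    # Corpus-level (here: pseudo-document-level) BLEU via sacrebleu, scaled to [0, 1].
    return sacrebleu.corpus_bleu(hyp_doc, [ref_doc]).score / 100.0


def mrt_document_risk(samples, logprobs, ref_doc, n_pseudo_docs=8, alpha=0.005):
    """Monte Carlo estimate of the expected document-BLEU cost for one batch.

    samples[i]  -- list of sampled translations for sentence i of the batch.
    logprobs[i] -- matching model log-probabilities log p(y | x_i).
    ref_doc     -- reference sentences forming the batch-level document.
    alpha       -- MRT smoothing factor on the sample distribution.
    """
    drawn = []
    for _ in range(n_pseudo_docs):
        # Second sampling level: pick one sample per sentence to build a pseudo-document.
        picks = [random.randrange(len(s)) for s in samples]
        hyp_doc = [samples[i][k] for i, k in enumerate(picks)]
        # Smoothed log-probability of the pseudo-document (sentences treated as independent).
        doc_logprob = alpha * sum(logprobs[i][k] for i, k in enumerate(picks))
        # Cost is 1 - BLEU, so minimising the risk maximises document BLEU.
        drawn.append((doc_logprob, 1.0 - doc_bleu(hyp_doc, ref_doc)))
    # Renormalise the smoothed probabilities over the drawn pseudo-documents (softmax).
    max_lp = max(lp for lp, _ in drawn)
    weights = [math.exp(lp - max_lp) for lp, _ in drawn]
    z = sum(weights)
    return sum((w / z) * cost for (_, cost), w in zip(drawn, weights))


# Toy usage with hand-written samples and log-probabilities:
samples = [["the cat sat .", "a cat sat ."], ["it was happy .", "he was glad ."]]
logprobs = [[-1.2, -2.3], [-0.8, -1.9]]
ref_doc = ["the cat sat .", "it was happy ."]
print(mrt_document_risk(samples, logprobs, ref_doc))
```

In an actual training loop the log-probabilities would come from the NMT model, and the weighted cost would be differentiated through the renormalised sample distribution to obtain the expected document-BLEU gradient; this sketch computes only the scalar risk for a single batch.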
