R2D2: Robust Data-to-Text with Replacement Detection

Paper Title

R2D2: Robust Data-to-Text with Replacement Detection

Authors

Nan, Linyong; Flores, Lorenzo Jaime Yu; Zhao, Yilun; Liu, Yixin; Benson, Luke; Zou, Weijin; Radev, Dragomir

Abstract

Unfaithful text generation is a common problem for text generation systems. For Data-to-Text (D2T) systems, the factuality of the generated text is particularly crucial for any real-world application. We introduce R2D2, a training framework that addresses unfaithful Data-to-Text generation by training a system both as a generator and as a faithfulness discriminator, with additional replacement detection and unlikelihood learning tasks. To facilitate such training, we propose two methods for sampling unfaithful sentences. We argue that the poor entity retrieval capability of D2T systems is one of the primary sources of unfaithfulness, so in addition to the existing metrics, we further propose NER-based metrics to evaluate the fidelity of D2T generations. Our experimental results show that R2D2 systems can effectively mitigate unfaithful text generation, and they achieve new state-of-the-art results on FeTaQA, LogicNLG, and ToTTo, all with significant improvements.
