Controllable Text Simplification with Explicit Paraphrasing

Paper Title

Controllable Text Simplification with Explicit Paraphrasing

Authors

Mounica Maddela, Fernando Alva-Manchego, Wei Xu

Abstract

Text Simplification improves the readability of sentences through several rewriting transformations, such as lexical paraphrasing, deletion, and splitting. Current simplification systems are predominantly sequence-to-sequence models that are trained end-to-end to perform all these operations simultaneously. However, such systems limit themselves to mostly deleting words and cannot easily adapt to the requirements of different target audiences. In this paper, we propose a novel hybrid approach that leverages linguistically-motivated rules for splitting and deletion, and couples them with a neural paraphrasing model to produce varied rewriting styles. We introduce a new data augmentation method to improve the paraphrasing capability of our model. Through automatic and manual evaluations, we show that our proposed model establishes a new state-of-the-art for the task, paraphrasing more often than the existing systems, and can control the degree of each simplification operation applied to the input texts.
