Paper Title
EdiT5: Semi-Autoregressive Text-Editing with T5 Warm-Start
Paper Authors
Paper Abstract
We present EdiT5, a novel semi-autoregressive text-editing model designed to combine the strengths of non-autoregressive text-editing and autoregressive decoding. EdiT5 is faster during inference than conventional sequence-to-sequence (seq2seq) models, while being capable of modelling flexible input-output transformations. This is achieved by decomposing the generation process into three sub-tasks: (1) tagging to decide on the subset of input tokens to be preserved in the output, (2) re-ordering to define their order in the output text, and (3) insertion to infill the missing tokens that are not present in the input. The tagging and re-ordering steps, which are responsible for generating the largest portion of the output, are non-autoregressive, while the insertion step uses an autoregressive decoder. Depending on the task, EdiT5 on average requires significantly fewer autoregressive steps, demonstrating speedups of up to 25x when compared to seq2seq models. Quality-wise, EdiT5, initialized with a pre-trained T5 checkpoint, yields performance comparable to T5 in high-resource settings when evaluated on three NLG tasks: Sentence Fusion, Grammatical Error Correction, and Decontextualization, while clearly outperforming T5 in low-resource settings.
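To make the three-sub-task decomposition concrete, here is a minimal sketch of how tagging, re-ordering, and insertion outputs could be recombined into the target text. All names and data structures below are illustrative assumptions for exposition, not the paper's actual implementation.

```python
# Hypothetical sketch: combining EdiT5-style tag / re-order / insert outputs.
# The keep mask and pointer order come from the non-autoregressive steps;
# the insertion tokens come from the autoregressive decoder.
from typing import Dict, List


def apply_edits(
    source_tokens: List[str],
    keep_mask: List[bool],          # (1) tagging: which source tokens to keep
    order: List[int],               # (2) re-ordering: permutation of the kept tokens
    insertions: Dict[int, List[str]],  # (3) insertion: gap index -> infilled tokens
) -> List[str]:
    # Keep only the tokens tagged for preservation.
    kept = [tok for tok, keep in zip(source_tokens, keep_mask) if keep]
    # Re-order the kept tokens according to the pointer output.
    reordered = [kept[i] for i in order]
    # Splice the autoregressively decoded insertions into their gaps.
    output: List[str] = []
    for gap, tok in enumerate(reordered):
        output.extend(insertions.get(gap, []))
        output.append(tok)
    output.extend(insertions.get(len(reordered), []))  # trailing insertion
    return output


# Toy sentence-fusion example: fuse "He is tired. He left." into one sentence.
src = ["He", "is", "tired", ".", "He", "left", "."]
keep = [True, True, True, False, False, True, False]
order = [0, 1, 2, 3]                        # kept tokens stay in source order
ins = {3: [",", "so", "he"], 4: ["."]}      # infill the connective and period
print(" ".join(apply_edits(src, keep, order, ins)))
# -> "He is tired , so he left ."
```

In this toy example, only three tokens (", so he") plus the period need autoregressive decoding, while the bulk of the output is copied non-autoregressively, which is the source of the claimed inference speedup.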