DiffusER: Discrete Diffusion via Edit-based Reconstruction

Paper title

DiffusER: Discrete Diffusion via Edit-based Reconstruction

Authors

Reid, Machel, Hellendoorn, Vincent J., Neubig, Graham

Abstract

In text generation, models that generate text from scratch one token at a time are currently the dominant paradigm. Despite being performant, these models lack the ability to revise existing text, which limits their usability in many practical scenarios. We look to address this, with DiffusER (Diffusion via Edit-based Reconstruction), a new edit-based generative model for text based on denoising diffusion models -- a class of models that use a Markov chain of denoising steps to incrementally generate data. DiffusER is not only a strong generative model in general, rivalling autoregressive models on several tasks spanning machine translation, summarization, and style transfer; it can also perform other varieties of generation that standard autoregressive models are not well-suited for. For instance, we demonstrate that DiffusER makes it possible for a user to condition generation on a prototype, or an incomplete sequence, and continue revising based on previous edit steps.

Paper title