Paper Title

Bridging the Gap Between Training and Inference of Bayesian Controllable Language Models

Paper Authors

Han Liu, Bingning Wang, Ting Yao, Haijin Liang, Jianjin Xu, Xiaolin Hu

Paper Abstract

Large-scale pre-trained language models have achieved great success on natural language generation tasks. However, it is difficult to control pre-trained language models to generate sentences with desired attributes such as topic and sentiment. Recently, Bayesian Controllable Language Models (BCLMs) have been shown to be efficient in controllable language generation. Rather than fine-tuning the parameters of pre-trained language models, BCLMs use external discriminators to guide their generation. However, the mismatch between training and inference of BCLMs limits the models' performance. To address this problem, in this work we propose a "Gemini Discriminator" for controllable language generation, which alleviates the mismatch problem at a small computational cost. We tested our method on two controllable language generation tasks: sentiment control and topic control. On both tasks, our method achieved new state-of-the-art results in automatic and human evaluations.
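For readers unfamiliar with the BCLM setup the abstract refers to, the following is a minimal sketch of discriminator-guided decoding in the general Bayesian style (as in FUDGE-like methods), not the paper's Gemini Discriminator. It illustrates the Bayes-rule fusion p(x_t | x_<t, c) ∝ p(x_t | x_<t) · p(c | x_≤t), where a frozen LM's next-token distribution is reweighted by an attribute discriminator. The functions lm_log_probs and disc_log_probs are toy stand-ins invented for this sketch.

```python
import numpy as np

# Toy vocabulary standing in for a real tokenizer's vocabulary.
VOCAB = ["good", "bad", "movie", "great", "awful"]

def lm_log_probs(prefix):
    # Stand-in for a pre-trained LM's next-token log-probabilities
    # p(x_t | x_<t); here a fixed toy distribution for illustration.
    logits = np.array([1.0, 1.0, 0.5, 0.8, 0.9])
    return logits - np.log(np.exp(logits).sum())

def disc_log_probs(prefix):
    # Stand-in for a discriminator's log p(attribute | x_<=t) for each
    # candidate next token; this toy version favors positive sentiment.
    scores = np.array([2.0, -2.0, 0.0, 2.0, -2.0])
    return -np.log1p(np.exp(-scores))  # log sigmoid

def controlled_step(prefix, strength=1.0):
    # Bayes-rule fusion in log space: LM prior + discriminator likelihood,
    # then renormalize and pick the most likely controlled token.
    fused = lm_log_probs(prefix) + strength * disc_log_probs(prefix)
    probs = np.exp(fused - fused.max())
    probs /= probs.sum()
    return VOCAB[int(np.argmax(probs))]

print(controlled_step(["the", "movie", "was"]))  # favors a positive token
```

The training/inference mismatch the paper targets arises because, in schemes like this, the discriminator is trained on full or clean prefixes but applied at inference to the LM's partial generations; the proposed Gemini Discriminator is the paper's remedy for that gap.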
