Paper Title

RankGen: Improving Text Generation with Large Ranking Models

Authors

Kalpesh Krishna, Yapei Chang, John Wieting, Mohit Iyyer

Abstract

Given an input sequence (or prefix), modern language models often assign high probabilities to output sequences that are repetitive, incoherent, or irrelevant to the prefix; as such, model-generated text also contains such artifacts. To address these issues we present RankGen, a 1.2B parameter encoder model for English that scores model generations given a prefix. RankGen can be flexibly incorporated as a scoring function in beam search and used to decode from any pretrained language model. We train RankGen using large-scale contrastive learning to map a prefix close to the ground-truth sequence that follows it and far away from two types of negatives: (1) random sequences from the same document as the prefix, and (2) sequences generated from a large language model conditioned on the prefix. Experiments across four different language models (345M-11B parameters) and two domains show that RankGen significantly outperforms decoding algorithms like nucleus, top-k, and typical sampling, as well as contrastive decoding and search, on both automatic metrics (85.0 vs 77.3 MAUVE over nucleus) as well as human evaluations with English writers (74.5% human preference over nucleus sampling). Analysis reveals that RankGen outputs are more relevant to the prefix and improve continuity and coherence compared to baselines. We release our model checkpoints, code, and human preference data with explanations to facilitate future research.
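
The abstract describes RankGen as an encoder that scores a candidate continuation against a prefix, which can then be used to rerank or guide decoding from any pretrained language model. The sketch below illustrates only the simplest use of that idea, over-generating candidates and keeping the highest-scoring one; it is not the authors' released code, and the ToyEncoder, encode_prefix, and encode_suffix names are hypothetical stand-ins for a real dual encoder.

# A minimal sketch of RankGen-style reranking, not the released implementation.
# Assumptions (hypothetical): a dual encoder exposing encode_prefix() and
# encode_suffix(), each returning a fixed-size vector, and a set of candidate
# continuations already sampled from a pretrained LM (e.g., nucleus sampling).
# A (prefix, continuation) pair is scored by the dot product of the two vectors.

import hashlib
import numpy as np


class ToyEncoder:
    """Stand-in for a RankGen-style dual encoder.

    The real model is a 1.2B-parameter Transformer encoder trained with
    large-scale contrastive learning; here we derive a deterministic
    pseudo-random vector from the text so the sketch runs end to end.
    """

    def __init__(self, dim: int = 16):
        self.dim = dim

    def _embed(self, text: str) -> np.ndarray:
        seed = int.from_bytes(hashlib.sha256(text.encode()).digest()[:4], "little")
        vec = np.random.default_rng(seed).standard_normal(self.dim)
        return vec / np.linalg.norm(vec)  # unit-normalize

    def encode_prefix(self, prefix: str) -> np.ndarray:
        return self._embed(prefix)

    def encode_suffix(self, suffix: str) -> np.ndarray:
        return self._embed(suffix)


def rankgen_rerank(encoder, prefix, candidates):
    """Score each candidate continuation by dot product with the prefix embedding."""
    p = encoder.encode_prefix(prefix)
    scored = [(c, float(p @ encoder.encode_suffix(c))) for c in candidates]
    return sorted(scored, key=lambda x: x[1], reverse=True)


if __name__ == "__main__":
    prefix = "The hikers reached the ridge just as the storm rolled in,"
    # In practice these would be sampled from a pretrained LM conditioned on the prefix.
    candidates = [
        " so they pitched their tent behind a boulder and waited it out.",
        " so they pitched their tent. They pitched their tent. They pitched",
        " and the stock market closed slightly higher on Tuesday.",
    ]
    for text, score in rankgen_rerank(ToyEncoder(), prefix, candidates):
        print(f"{score:+.3f}  {text}")

In the paper the same scoring function is also incorporated into beam search rather than applied only as a final reranking step; this sketch shows just the scoring interface, with a toy embedding in place of the trained encoder.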
