Paper Title
SummaReranker: A Multi-Task Mixture-of-Experts Re-ranking Framework for Abstractive Summarization
Paper Authors
Paper Abstract
Sequence-to-sequence neural networks have recently achieved great success in abstractive summarization, especially through fine-tuning large pre-trained language models on the downstream dataset. These models are typically decoded with beam search to generate a unique summary. However, the search space is very large, and, due to exposure bias, such decoding is not optimal. In this paper, we show that it is possible to directly train a second-stage model performing re-ranking on a set of summary candidates. Our mixture-of-experts SummaReranker learns to select a better candidate and consistently improves the performance of the base model. With a base PEGASUS, we push ROUGE scores by 5.44% on CNN-DailyMail (47.16 ROUGE-1), 1.31% on XSum (48.12 ROUGE-1) and 9.34% on Reddit TIFU (29.83 ROUGE-1), reaching a new state-of-the-art. Our code and checkpoints will be available at https://github.com/ntunlp/SummaReranker.
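The abstract describes a two-stage pipeline: a base model (PEGASUS) decodes several summary candidates with beam search, and a second-stage model then re-ranks them and returns the best one. Below is a minimal Python sketch of that pipeline using Hugging Face Transformers. The candidate-generation step follows the setup named in the abstract; the scoring step is a hypothetical stand-in (a single-logit cross-encoder under an invented checkpoint name, `your-org/summary-reranker`), not the paper's actual multi-task mixture-of-experts re-ranker.

```python
# Sketch of a candidate-generation + re-ranking pipeline, assuming the
# Hugging Face `transformers` library. The re-ranker checkpoint name and
# scoring head below are illustrative placeholders, not the paper's model.
import torch
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    PegasusForConditionalGeneration,
    PegasusTokenizer,
)


def generate_candidates(source: str, num_candidates: int = 8) -> list[str]:
    """Stage 1: decode several summary candidates with beam search."""
    name = "google/pegasus-cnn_dailymail"
    tokenizer = PegasusTokenizer.from_pretrained(name)
    model = PegasusForConditionalGeneration.from_pretrained(name)
    inputs = tokenizer(source, truncation=True, return_tensors="pt")
    outputs = model.generate(
        **inputs,
        num_beams=num_candidates,
        num_return_sequences=num_candidates,  # keep every beam, not just the top one
    )
    return tokenizer.batch_decode(outputs, skip_special_tokens=True)


def rerank(source: str, candidates: list[str]) -> str:
    """Stage 2: score each (source, candidate) pair and keep the best candidate.

    Any sequence-classification model fine-tuned to predict candidate quality
    could slot in here; the checkpoint name is hypothetical.
    """
    name = "your-org/summary-reranker"  # hypothetical re-ranker checkpoint
    tokenizer = AutoTokenizer.from_pretrained(name)
    scorer = AutoModelForSequenceClassification.from_pretrained(name, num_labels=1)
    with torch.no_grad():
        batch = tokenizer(
            [source] * len(candidates),  # pair the source with each candidate
            candidates,
            truncation=True,
            padding=True,
            return_tensors="pt",
        )
        scores = scorer(**batch).logits.squeeze(-1)  # one scalar score per candidate
    return candidates[int(scores.argmax())]


if __name__ == "__main__":
    document = "Some long news article to be summarized ..."
    candidates = generate_candidates(document)
    print(rerank(document, candidates))
```

The key design point this sketch illustrates is that beam search is used as a candidate generator rather than a single-output decoder: keeping all `num_return_sequences` beams widens the pool the second stage can choose from, which is what lets a learned re-ranker recover summaries the top beam would have discarded.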