Title
What Have We Achieved on Text Summarization?
Authors
Abstract
Deep learning has led to significant improvement in text summarization, with various methods investigated and improved ROUGE scores reported over the years. However, gaps still exist between summaries produced by automatic summarizers and human professionals. Aiming to gain a deeper understanding of summarization systems with respect to their strengths and limitations on a fine-grained syntactic and semantic level, we consult the Multidimensional Quality Metrics (MQM) framework and manually quantify 8 major sources of errors across 10 representative summarization models. Primarily, we find that 1) under similar settings, extractive summarizers are in general better than their abstractive counterparts, thanks to their strength in faithfulness and factual consistency; 2) milestone techniques such as copy, coverage, and hybrid extractive/abstractive methods do bring specific improvements but also demonstrate limitations; 3) pre-training techniques, and in particular sequence-to-sequence pre-training, are highly effective for improving text summarization, with BART giving the best results.