KLearn: Background Knowledge Inference from Summarization Data

Paper Title

KLearn: Background Knowledge Inference from Summarization Data

Authors

Maxime Peyrard, Robert West

Abstract

The goal of text summarization is to compress documents to the relevant information while excluding background information already known to the receiver. So far, summarization researchers have given considerably more attention to relevance than to background knowledge. In contrast, this work puts background knowledge in the foreground. Building on the realization that the choices made by human summarizers and annotators contain implicit information about their background knowledge, we develop and compare techniques for inferring background knowledge from summarization data. Based on this framework, we define summary scoring functions that explicitly model background knowledge, and show that these scoring functions fit human judgments significantly better than baselines. We illustrate some of the many potential applications of our framework. First, we provide insights into human information importance priors. Second, we demonstrate that averaging the background knowledge of multiple, potentially biased annotators or corpora greatly improves summary-scoring performance. Finally, we discuss potential applications of our framework beyond summarization.
