Paper Title

KALM: Knowledge-Aware Integration of Local, Document, and Global Contexts for Long Document Understanding

Authors

Shangbin Feng, Zhaoxuan Tan, Wenqian Zhang, Zhenyu Lei, Yulia Tsvetkov

Abstract

With the advent of pretrained language models (LMs), increasing research efforts have been focusing on infusing commonsense and domain-specific knowledge to prepare LMs for downstream tasks. These works attempt to leverage knowledge graphs, the de facto standard of symbolic knowledge representation, along with pretrained LMs. While existing approaches have leveraged external knowledge, it remains an open question how to jointly incorporate knowledge graphs representing varying contexts, from local (e.g., sentence), to document-level, to global knowledge, to enable knowledge-rich exchange across these contexts. Such rich contextualization can be especially beneficial for long document understanding tasks since standard pretrained LMs are typically bounded by the input sequence length. In light of these challenges, we propose KALM, a Knowledge-Aware Language Model that jointly leverages knowledge in local, document-level, and global contexts for long document understanding. KALM first encodes long documents and knowledge graphs into the three knowledge-aware context representations. It then processes each context with context-specific layers, followed by a context fusion layer that facilitates knowledge exchange to derive an overarching document representation. Extensive experiments demonstrate that KALM achieves state-of-the-art performance on six long document understanding tasks and datasets. Further analyses reveal that the three knowledge-aware contexts are complementary and they all contribute to model performance, while the importance and information exchange patterns of different contexts vary with respect to different tasks and datasets.
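The pipeline described above — encode three knowledge-aware context representations (local, document-level, global), process each with context-specific layers, then fuse them into one document representation — can be sketched minimally as follows. This is a hypothetical illustration, not the paper's implementation: the dimension `d`, the linear context-specific layers, and the attention-style fusion are all simplifying assumptions standing in for KALM's actual encoders and fusion layer.

```python
# Minimal sketch of a three-context fusion pipeline (hypothetical shapes and
# layer choices; KALM's real context-specific and fusion layers are richer).
import numpy as np

rng = np.random.default_rng(0)
d = 8  # hidden size (assumed for illustration)

# Three knowledge-aware context representations: local, document-level, global.
contexts = {name: rng.standard_normal(d)
            for name in ("local", "document", "global")}

# Context-specific layers: one linear map + nonlinearity per context,
# a stand-in for the paper's per-context processing layers.
weights = {name: rng.standard_normal((d, d)) / np.sqrt(d) for name in contexts}
hidden = {name: np.tanh(weights[name] @ vec) for name, vec in contexts.items()}

# Context fusion layer: attention-style weighted sum, so information from all
# three contexts is exchanged into one overarching document representation.
stacked = np.stack(list(hidden.values()))   # shape (3, d)
scores = stacked @ stacked.mean(axis=0)     # relevance of each context
alphas = np.exp(scores - scores.max())      # stable softmax over 3 contexts
alphas /= alphas.sum()
doc_repr = alphas @ stacked                 # fused representation, shape (d,)

print(doc_repr.shape)  # (8,)
```

The fused `doc_repr` would then feed a task head (e.g., a classifier) for the downstream long-document task; which context dominates the fusion weights can vary by task, matching the paper's observation that context importance differs across datasets.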
