Paper Title

Knowledgeable Salient Span Mask for Enhancing Language Models as Knowledge Base

Paper Authors

Cunxiang Wang, Fuli Luo, Yanyang Li, Runxin Xu, Fei Huang, Yue Zhang

Paper Abstract

Pre-trained language models (PLMs) like BERT have made significant progress in various downstream NLP tasks. However, by asking models to do cloze-style tests, recent work finds that PLMs fall short in acquiring knowledge from unstructured text. To understand the internal behaviour of PLMs when retrieving knowledge, we first define knowledge-baring (K-B) tokens and knowledge-free (K-F) tokens for unstructured text and ask professional annotators to label some samples manually. We then find that PLMs are more likely to give wrong predictions on K-B tokens and to pay less attention to those tokens inside the self-attention module. Based on these observations, we develop two solutions to help the model learn more knowledge from unstructured text in a fully self-supervised manner. Experiments on knowledge-intensive tasks show the effectiveness of the proposed methods. To the best of our knowledge, we are the first to explore fully self-supervised learning of knowledge in continual pre-training.
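The abstract only describes the masking idea at a high level, so the following is a minimal, self-contained Python sketch of the general principle of biasing masked-language-model masking toward knowledge-baring (K-B) tokens during continual pre-training. The heuristic K-B detector, the names `is_knowledge_baring` and `select_mask_positions`, and all ratios are illustrative assumptions, not the authors' implementation; the paper identifies K-B tokens in a fully self-supervised way rather than with a hand-written rule.

```python
import random

def is_knowledge_baring(token: str) -> bool:
    """Toy stand-in for a K-B detector: treat capitalized or numeric tokens
    (e.g. entities, dates) as knowledge-baring, everything else as
    knowledge-free (K-F). This heuristic is an assumption for illustration only."""
    return token[:1].isupper() or token.isdigit()

def select_mask_positions(tokens, mask_ratio=0.15, kb_prob=0.8, seed=0):
    """Choose positions to mask, drawing most of the masking budget from
    K-B tokens and the remainder from K-F tokens."""
    rng = random.Random(seed)
    kb = [i for i, t in enumerate(tokens) if is_knowledge_baring(t)]
    kf = [i for i, t in enumerate(tokens) if not is_knowledge_baring(t)]
    n_mask = max(1, int(len(tokens) * mask_ratio))
    n_kb = min(len(kb), int(round(n_mask * kb_prob)))
    n_kf = min(len(kf), n_mask - n_kb)
    return sorted(rng.sample(kb, n_kb) + rng.sample(kf, n_kf))

if __name__ == "__main__":
    tokens = "Marie Curie won the Nobel Prize in 1903 for work on radioactivity".split()
    positions = select_mask_positions(tokens)
    print(" ".join("[MASK]" if i in positions else t for i, t in enumerate(tokens)))
```

Spending most of the masking budget on K-B positions forces the model to predict knowledge-rich content rather than function words, which mirrors the motivation stated in the abstract.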
