Paper Title
A Survey on Knowledge-Enhanced Pre-trained Language Models
Paper Authors
Paper Abstract
Natural Language Processing (NLP) has been revolutionized by the use of Pre-trained Language Models (PLMs) such as BERT. Despite setting new records in nearly every NLP task, PLMs still face a number of challenges, including poor interpretability, weak reasoning capability, and the need for large amounts of expensive annotated data when applied to downstream tasks. By integrating external knowledge into PLMs, Knowledge-Enhanced Pre-trained Language Models (KEPLMs) have the potential to overcome the above-mentioned limitations. In this paper, we examine KEPLMs systematically through a series of studies. Specifically, we outline the common types and different formats of knowledge to be integrated into KEPLMs, detail the existing methods for building and evaluating KEPLMs, present the applications of KEPLMs in downstream tasks, and discuss future research directions. Researchers will benefit from this survey by gaining a quick and comprehensive overview of the latest developments in this field.
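To make the idea of knowledge integration more concrete, below is a minimal sketch of one common approach in this space: verbalizing a retrieved knowledge-graph triple and feeding it to a PLM alongside the input sentence. This is an illustrative assumption, not the method of any specific KEPLM covered by the survey; the example triple and the use of bert-base-uncased via Hugging Face transformers are stand-ins.

# Minimal sketch: inject external knowledge by pairing the input
# sentence with a verbalized knowledge-graph triple, so the encoder
# can attend across both segments. Assumes `transformers` is installed.
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")

sentence = "Bob Dylan wrote Blowin' in the Wind."
# Hypothetical retrieved triple, rendered as plain text.
triple = "Bob Dylan | occupation | songwriter"

# Encode sentence and knowledge text as a sentence pair.
inputs = tokenizer(sentence, triple, return_tensors="pt")
outputs = model(**inputs)
print(outputs.last_hidden_state.shape)  # (1, sequence_length, 768)

Actual KEPLMs surveyed in the paper go well beyond this text-concatenation sketch, for example by fusing pre-trained entity embeddings into intermediate layers or adding knowledge-aware pre-training objectives.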