Paper Title

Language Models as Knowledge Bases: On Entity Representations, Storage Capacity, and Paraphrased Queries

Authors

Benjamin Heinzerling, Kentaro Inui

Abstract

Pretrained language models have been suggested as a possible alternative or complement to structured knowledge bases. However, this emerging LM-as-KB paradigm has so far only been considered in a very limited setting, which only allows handling 21k entities whose single-token name is found in common LM vocabularies. Furthermore, the main benefit of this paradigm, namely querying the KB using a variety of natural language paraphrases, is underexplored so far. Here, we formulate two basic requirements for treating LMs as KBs: (i) the ability to store a large number of facts involving a large number of entities and (ii) the ability to query stored facts. We explore three entity representations that allow LMs to represent millions of entities and present a detailed case study on paraphrased querying of world knowledge in LMs, thereby providing a proof-of-concept that language models can indeed serve as knowledge bases.
