Paper Title

Intrinsic Knowledge Evaluation on Chinese Language Models

Authors

Zhiruo Wang, Renfen Hu

Abstract

Recent NLP tasks have benefited greatly from pre-trained language models (LMs), since these models are able to encode knowledge of various aspects. However, current LM evaluations focus on downstream performance, and hence fail to comprehensively inspect in which aspects and to what extent the models encode knowledge. This paper addresses both questions by proposing four tasks on syntactic, semantic, commonsense, and factual knowledge, aggregating to a total of 39,308 questions that cover both linguistic and world knowledge in Chinese. Throughout our experiments, the probes and knowledge data prove to be a reliable benchmark for evaluating pre-trained Chinese LMs. Our work is publicly available at https://github.com/ZhiruoWang/ChnEval.
