Fine-Grained Named Entity Typing over Distantly Supervised Data Based on Refined Representations

Paper Title

Fine-Grained Named Entity Typing over Distantly Supervised Data Based on Refined Representations

Authors

Muhammad Asif Ali, Yifang Sun, Bing Li, Wei Wang

Abstract

Fine-Grained Named Entity Typing (FG-NET) is a key component of Natural Language Processing (NLP). It aims to classify an entity mention into a wide range of entity types. Because the number of entity types is large, distant supervision is used to collect training data for this task, which noisily assigns type labels to entity mentions irrespective of context. To alleviate the noisy labels, existing FG-NET approaches analyze entity mentions entirely independently of each other and assign type labels based solely on the mention's sentence-specific context. This is inadequate for highly overlapping and noisy type labels because it hinders information passing across sentence boundaries. To address this, we propose an edge-weighted attentive graph convolution network that refines the noisy mention representations by attending over corpus-level contextual clues prior to the final classification. Experimental evaluation shows that the proposed model outperforms existing research by a relative score of up to 10.2% and 8.3% for macro-F1 and micro-F1, respectively.
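
The abstract does not spell out the network's equations, so the following is only a minimal sketch of the general idea of edge-weighted attentive aggregation over a graph of mentions, not the authors' model. The function name `refine_mentions`, the edge weights, and the dot-product attention score are all assumptions made for illustration; learned weight matrices and nonlinearities, which a real graph convolution layer would include, are left out.

```python
import math


def dot(u, v):
    """Plain dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def refine_mentions(reps, edges):
    """One round of edge-weighted attentive aggregation (illustrative only).

    reps  : list of mention vectors (lists of floats), one per graph node
    edges : dict mapping (i, j) -> positive edge weight; the weights are
            hypothetical, e.g. a similarity between the contexts of two mentions
    """
    n = len(reps)
    # Symmetric neighbour lists; every node also attends to itself with weight 1.
    neighbours = {i: {i: 1.0} for i in range(n)}
    for (i, j), w in edges.items():
        neighbours[i][j] = w
        neighbours[j][i] = w

    refined = []
    for i in range(n):
        # Attention score = edge weight * exp(similarity), shifted by the
        # maximum similarity for numerical stability.
        sims = {j: dot(reps[i], reps[j]) for j in neighbours[i]}
        top = max(sims.values())
        scores = {j: neighbours[i][j] * math.exp(sims[j] - top) for j in sims}
        total = sum(scores.values())
        # The refined vector is the attention-weighted average of the neighbours.
        refined.append([
            sum(scores[j] / total * reps[j][d] for j in scores)
            for d in range(len(reps[i]))
        ])
    return refined


# Toy graph: mentions 0 and 1 are strongly linked, mentions 1 and 2 weakly.
mentions = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
graph = {(0, 1): 1.0, (1, 2): 0.2}
refined = refine_mentions(mentions, graph)
```

In this sketch each refined vector is a convex combination of a mention and its graph neighbours, so a mention with a noisy representation is pulled toward strongly connected mentions elsewhere in the corpus, which is the intuition behind passing information across sentence boundaries.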

Paper Title