SemEval-2022 Task 11: A Knowledge-based System for Multilingual Named Entity Recognition

Paper title

DAMO-NLP at SemEval-2022 Task 11: A Knowledge-based System for Multilingual Named Entity Recognition

Authors

Xinyu Wang, Yongliang Shen, Jiong Cai, Tao Wang, Xiaobin Wang, Pengjun Xie, Fei Huang, Weiming Lu, Yueting Zhuang, Kewei Tu, Wei Lu, Yong Jiang

Abstract

The MultiCoNER shared task aims at detecting semantically ambiguous and complex named entities in short, low-context settings across multiple languages. The lack of context makes ambiguous named entities hard to recognize. To alleviate this issue, our team DAMO-NLP proposes a knowledge-based system, in which we build a multilingual knowledge base from Wikipedia to provide related context information to the named entity recognition (NER) model. Given an input sentence, our system retrieves related contexts from the knowledge base. The original input sentence is then augmented with this context information, which allows the model to capture significantly better contextualized token representations. Our system wins 10 out of 13 tracks in the MultiCoNER shared task.
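
The retrieve-then-augment idea in the abstract can be illustrated with a minimal sketch. The abstract does not say how retrieval is scored or how contexts are attached to the input, so everything below is an assumption made for illustration: the toy `KNOWLEDGE_BASE`, the token-overlap scoring in `overlap_score`, the `[SEP]` separator, and the function names `retrieve` and `augment` are all hypothetical and are not the authors' implementation.

```python
from collections import Counter

# Hypothetical toy knowledge base. The system in the abstract builds a
# multilingual one from Wikipedia; these three passages are stand-ins.
KNOWLEDGE_BASE = [
    "Inception is a 2010 science fiction film directed by Christopher Nolan.",
    "Paris is the capital and most populous city of France.",
    "Python is a high-level programming language.",
]

# Assumed separator between the input sentence and each retrieved context.
SEP = " [SEP] "

PUNCT = ".,!?;:\"'()"


def tokenize(text):
    """Lowercase whitespace tokenizer that strips surrounding punctuation."""
    tokens = (t.strip(PUNCT).lower() for t in text.split())
    return [t for t in tokens if t]


def overlap_score(query, passage):
    """Number of tokens shared by query and passage (multiset intersection)."""
    q, p = Counter(tokenize(query)), Counter(tokenize(passage))
    return sum((q & p).values())


def retrieve(sentence, kb, top_k=1):
    """Return up to top_k passages with a non-zero overlap score, best first."""
    scored = [(overlap_score(sentence, p), p) for p in kb]
    scored = [sp for sp in scored if sp[0] > 0]
    scored.sort(key=lambda sp: sp[0], reverse=True)
    return [p for _, p in scored[:top_k]]


def augment(sentence, kb, top_k=1):
    """Append retrieved contexts to the sentence; unchanged if none match."""
    contexts = retrieve(sentence, kb, top_k)
    return sentence + "".join(SEP + c for c in contexts)


if __name__ == "__main__":
    print(augment("who directed inception", KNOWLEDGE_BASE))
```

In a full pipeline, the augmented string would be passed to the NER model, with predictions kept only for the tokens of the original sentence. That step is omitted here because the abstract gives no detail about the model.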

Paper title