Paper Title

Improving Chinese Named Entity Recognition by Search Engine Augmentation

Authors

Qinghua Mao, Jiatong Li, Kui Meng

Abstract

Compared with English, Chinese suffers from more grammatical ambiguities, such as fuzzy word boundaries and polysemous words. In this case, contextual information is not sufficient to support Chinese named entity recognition (NER), especially for rare and emerging named entities. Semantic augmentation with external knowledge is a potential way to alleviate this problem, but how to obtain and leverage external knowledge for the NER task remains a challenge. In this paper, we propose a neural approach that performs semantic augmentation for Chinese NER using external knowledge from a search engine. In particular, a multi-channel semantic fusion model aggregates related external texts retrieved from the search engine to generate augmented input representations. Experiments show the superiority of our model across four NER datasets, covering both formal and social media language contexts, which further demonstrates the effectiveness of our approach.
