Paper Title
TOE: A Grid-Tagging Discontinuous NER Model Enhanced by Embedding Tag/Word Relations and More Fine-Grained Tags
Paper Authors
Paper Abstract
So far, discontinuous named entity recognition (NER) has received increasing research attention, and many related methods have emerged, such as hypergraph-based, span-based, and sequence-to-sequence (Seq2Seq) methods. However, these methods suffer to varying degrees from problems such as decoding ambiguity and inefficiency, which limit their performance. Recently, grid-tagging methods, which benefit from the flexible design of their tagging schemes and model architectures, have shown superior adaptability to various information extraction tasks. In this paper, we follow this line of work and propose a competitive grid-tagging model for discontinuous NER. We call our model TOE because we incorporate two kinds of Tag-Oriented Enhancement mechanisms into a state-of-the-art (SOTA) grid-tagging model that casts the NER problem as word-word relationship prediction. First, we design a Tag Representation Embedding Module (TREM) to force our model to consider not only word-word relationships but also word-tag and tag-tag relationships. Concretely, we construct tag representations and embed them into TREM, so that TREM can treat tag and word representations as queries/keys/values and use self-attention to model their relationships. Second, motivated by the Next-Neighboring-Word (NNW) and Tail-Head-Word (THW) tags in the SOTA model, we add two new symmetric tags, namely Previous-Neighboring-Word (PNW) and Head-Tail-Word (HTW), to model more fine-grained word-word relationships and alleviate error propagation from tag prediction. In experiments on three benchmark datasets, namely CADEC, ShARe13, and ShARe14, our TOE model improves the SOTA F1 results by about 0.83%, 0.05%, and 0.66%, respectively, demonstrating its effectiveness.
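As a rough illustration of the two enhancements described in the abstract, the following is a minimal PyTorch sketch, not the authors' implementation. The class name TREM and the tag names (NNW, THW, PNW, HTW) mirror the abstract; the hidden size, head count, and the way tags are appended to the word sequence are assumptions made for illustration only.

```python
# Minimal sketch (assumed design, not the paper's code) of a Tag Representation
# Embedding Module: learned tag embeddings are attended over jointly with word
# representations, so self-attention can model word-word, word-tag and tag-tag
# relationships in one pass.
import torch
import torch.nn as nn

# The SOTA grid-tagging relations plus TOE's two new symmetric tags.
TAGS = ["NONE",
        "NNW",   # Next-Neighboring-Word
        "THW",   # Tail-Head-Word
        "PNW",   # Previous-Neighboring-Word (new, symmetric to NNW)
        "HTW"]   # Head-Tail-Word (new, symmetric to THW)

class TREM(nn.Module):
    """Tag Representation Embedding Module (illustrative sketch)."""
    def __init__(self, hidden_dim: int, num_tags: int = len(TAGS), num_heads: int = 4):
        super().__init__()
        self.tag_emb = nn.Embedding(num_tags, hidden_dim)  # one vector per tag
        self.attn = nn.MultiheadAttention(hidden_dim, num_heads, batch_first=True)

    def forward(self, word_reps: torch.Tensor) -> torch.Tensor:
        # word_reps: (batch, seq_len, hidden_dim), e.g. encoder outputs.
        batch, seq_len, _ = word_reps.shape
        tag_ids = torch.arange(self.tag_emb.num_embeddings, device=word_reps.device)
        tag_reps = self.tag_emb(tag_ids).unsqueeze(0).expand(batch, -1, -1)
        # Treat word and tag representations uniformly as queries/keys/values.
        x = torch.cat([word_reps, tag_reps], dim=1)
        out, _ = self.attn(x, x, x)
        # Return only the (now tag-aware) word positions for downstream grid scoring.
        return out[:, :seq_len, :]

# Usage: enrich encoder outputs before predicting the word-word relation grid.
words = torch.randn(2, 10, 128)          # (batch=2, seq_len=10, hidden=128)
enriched = TREM(hidden_dim=128)(words)   # same shape, now tag-aware
print(enriched.shape)                    # torch.Size([2, 10, 128])
```

In this sketch the symmetric tags simply enlarge the label inventory that the grid classifier predicts; the intuition from the abstract is that predicting each relation in both directions (NNW/PNW, THW/HTW) gives the decoder redundant evidence and thus reduces error propagation from any single tag prediction.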