Paper Title
Propose-and-Refine: A Two-Stage Set Prediction Network for Nested Named Entity Recognition
Paper Authors
Paper Abstract
Nested named entity recognition (nested NER) is a fundamental task in natural language processing. Various span-based methods have been proposed to detect nested entities with span representations. However, span-based methods do not consider the relationship between a span and other entities or phrases, which is helpful in the NER task. Besides, span-based methods have trouble predicting long entities due to limited span enumeration length. To mitigate these issues, we present the Propose-and-Refine Network (PnRNet), a two-stage set prediction network for nested NER. In the propose stage, we use a span-based predictor to generate some coarse entity predictions as entity proposals. In the refine stage, proposals interact with each other, and richer contextual information is incorporated into the proposal representations. The refined proposal representations are used to re-predict entity boundaries and classes. In this way, errors in coarse proposals can be eliminated, and the boundary prediction is no longer constrained by the span enumeration length limitation. Additionally, we build multi-scale sentence representations, which better model the hierarchical structure of sentences and provide richer contextual information than token-level representations. Experiments show that PnRNet achieves state-of-the-art performance on four nested NER datasets and one flat NER dataset.
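The span-length limitation and the idea of re-predicting boundaries can be illustrated with a small toy sketch. It is not PnRNet itself: the function names `enumerate_spans` and `refine_boundaries` are hypothetical, and the boundary scores are hand-written numbers standing in for what a trained refine stage might output from refined proposal representations. The sketch only shows why a candidate set capped at `max_len` tokens cannot contain a longer entity, while picking each boundary independently over all token positions can.

```python
from typing import List, Tuple

Span = Tuple[int, int]  # (start, end), inclusive token indices


def enumerate_spans(num_tokens: int, max_len: int) -> List[Span]:
    """List every span of at most max_len tokens.

    This is the candidate set a span-based predictor chooses from.
    """
    return [
        (start, end)
        for start in range(num_tokens)
        for end in range(start, min(start + max_len, num_tokens))
    ]


def argmax(scores: List[float]) -> int:
    """Return the index of the largest score."""
    return max(range(len(scores)), key=scores.__getitem__)


def refine_boundaries(left_scores: List[float], right_scores: List[float]) -> Span:
    """Pick the best-scoring token position for each boundary.

    Each boundary ranges over all tokens, so the result is not
    limited by the max_len used during span enumeration.
    """
    return (argmax(left_scores), argmax(right_scores))


# Toy sentence of 8 tokens with a gold entity covering tokens 0..6 (7 tokens long).
num_tokens, max_len, gold = 8, 4, (0, 6)

candidates = enumerate_spans(num_tokens, max_len)
gold_reachable = gold in candidates  # False: the gold span exceeds max_len

# Hypothetical boundary scores for one proposal (assumed values, not model output).
left_scores = [0.90, 0.03, 0.02, 0.01, 0.01, 0.01, 0.01, 0.01]
right_scores = [0.01, 0.01, 0.02, 0.10, 0.05, 0.06, 0.70, 0.05]

refined = refine_boundaries(left_scores, right_scores)  # (0, 6)
```

In this toy setting a coarse proposal such as `(0, 3)` could be corrected to `(0, 6)` once the boundary scores reflect wider context. How those scores are actually computed (proposal interaction and multi-scale sentence representations) is described only at a high level in the abstract and is not reproduced here.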