Paper Title
Document-Level Relation Extraction with Adaptive Thresholding and Localized Context Pooling
Paper Authors
Paper Abstract
Document-level relation extraction (RE) poses new challenges compared to its sentence-level counterpart. A document commonly contains multiple entity pairs, and a single entity pair may occur multiple times in the document and be associated with multiple possible relations. In this paper, we propose two novel techniques, adaptive thresholding and localized context pooling, to solve the multi-label and multi-entity problems. Adaptive thresholding replaces the global threshold used for multi-label classification in prior work with a learnable, entity-dependent threshold. Localized context pooling directly transfers attention from pre-trained language models to locate the context that is relevant for deciding the relation. We experiment on three document-level RE benchmark datasets: DocRED, a recently released large-scale RE dataset, and two datasets in the biomedical domain, CDR and GDA. Our ATLOP (Adaptive Thresholding and Localized cOntext Pooling) model achieves an F1 score of 63.4 on DocRED, and also significantly outperforms existing models on both CDR and GDA.
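The two ideas in the abstract can be illustrated with a minimal sketch in plain Python. It is not the authors' implementation. It assumes the learnable threshold is an extra score stored at index 0 of each entity pair's score vector, and that each entity's attention over the document tokens has already been extracted from the language model and averaged. The function names `predict_relations` and `localized_context` are hypothetical, as are all the numbers.

```python
from typing import List


def predict_relations(scores: List[float], threshold_index: int = 0) -> List[int]:
    """Adaptive thresholding at inference time (illustrative only).

    Assumption: scores[threshold_index] is the threshold score for this
    entity pair; every other entry is a relation score. A relation is
    predicted when its score exceeds the pair's own threshold score.
    """
    threshold = scores[threshold_index]
    return [
        i for i, s in enumerate(scores)
        if i != threshold_index and s > threshold
    ]


def localized_context(
    subj_attn: List[float],
    obj_attn: List[float],
    token_embeddings: List[List[float]],
) -> List[float]:
    """Localized context pooling (illustrative only).

    Assumption: subj_attn[i] and obj_attn[i] are the attention weights the
    subject and object entities place on token i. Tokens that matter to
    BOTH entities get a high weight via the element-wise product.
    """
    joint = [s * o for s, o in zip(subj_attn, obj_attn)]
    total = sum(joint)
    dim = len(token_embeddings[0])
    if total == 0.0:
        # No token is attended to by both entities: return a zero vector.
        return [0.0] * dim
    weights = [j / total for j in joint]
    # Weighted sum of token embeddings under the joint attention weights.
    return [
        sum(w * emb[d] for w, emb in zip(weights, token_embeddings))
        for d in range(dim)
    ]


# Two entity pairs with identical relation scores but different thresholds.
pair_a = [0.5, 2.0, -1.0, 0.7]   # threshold score 0.5
pair_b = [3.0, 2.0, -1.0, 0.7]   # threshold score 3.0
relations_a = predict_relations(pair_a)
relations_b = predict_relations(pair_b)

# Only the middle token is attended to by both entities.
context = localized_context(
    [0.5, 0.5, 0.0],
    [0.0, 0.5, 0.5],
    [[1.0, 0.0], [0.0, 2.0], [3.0, 3.0]],
)
```

In this toy example the two entity pairs share the same relation scores, yet `pair_a` yields relations 1 and 3 while `pair_b` yields none, which is the behavior a single global threshold cannot reproduce. The pooled `context` equals the embedding of the one token both entities attend to.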