Paper Title
Not Just Plain Text! Fuel Document-Level Relation Extraction with Explicit Syntax Refinement and Subsentence Modeling
Paper Authors
Abstract
Document-level relation extraction (DocRE) aims to identify semantic labels among entities within a single document. One major challenge of DocRE is extracting the decisive details about a specific entity pair from long text. However, in many cases, only a fraction of the text carries the required information, even within the manually labeled supporting evidence. To better capture and exploit instructive information, we propose a novel expLicit syntAx Refinement and Subsentence mOdeliNg based framework (LARSON). By introducing extra syntactic information, LARSON can model subsentences of arbitrary granularity and efficiently screen for instructive ones. Moreover, we incorporate refined syntax into text representations, which further improves the performance of LARSON. Experimental results on three benchmark datasets (DocRED, CDR, and GDA) demonstrate that LARSON significantly outperforms existing methods.