A Practical Framework for Relation Extraction with Noisy Labels Based on Doubly Transitional Loss

Paper Title

A Practical Framework for Relation Extraction with Noisy Labels Based on Doubly Transitional Loss

Authors

Wu, Shanchan; Fan, Kai

Abstract

Either human annotation or rule-based automatic labeling is an effective method to augment data for relation extraction. However, the inevitable wrong-labeling problem, for example from distant supervision, may degrade the performance of many existing methods. To address this issue, we introduce a practical end-to-end deep learning framework comprising a standard feature extractor and a novel noisy classifier built on our proposed doubly transitional mechanism. One transition is parameterized by a non-linear transformation between hidden layers that implicitly represents the conversion between the true and noisy labels, and it can be readily optimized together with the other model parameters. The other is an explicit probability transition matrix that captures the direct conversion between labels but needs to be derived from an EM algorithm. We conduct experiments on the NYT dataset and SemEval 2018 Task 7. The empirical results show performance comparable to or better than state-of-the-art methods.
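To make the explicit transition matrix concrete, the sketch below shows the generic noise-transition formulation, in which the predicted distribution over true labels is mapped to a distribution over noisy labels, followed by one EM-style re-estimation of the matrix. It is an illustration under stated assumptions and not the authors' implementation: the function names, the row-stochastic convention T[i, j] = p(noisy = j | true = i), and the particular EM update are all assumed here, since the abstract does not give these details.

```python
import numpy as np


def softmax(logits):
    """Numerically stable softmax over the last axis."""
    shifted = logits - logits.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)


def noisy_label_probs(true_probs, transition):
    """Map p(true | x) to p(noisy | x) using a transition matrix.

    Assumed convention: transition[i, j] = p(noisy = j | true = i),
    so every row of `transition` sums to 1.
    """
    return true_probs @ transition


def em_update_transition(true_probs, noisy_labels, transition, eps=1e-12):
    """One hypothetical EM step that re-estimates the transition matrix.

    E-step: posterior over the true label given the observed noisy label.
    M-step: normalized expected counts of (true label, noisy label) pairs.
    """
    num_classes = transition.shape[0]
    # joint[n, i] = p(true = i | x_n) * T[i, y_n]
    joint = true_probs * transition[:, noisy_labels].T
    posterior = joint / (joint.sum(axis=1, keepdims=True) + eps)
    # counts[i, j] = sum over n of posterior[n, i] * 1[y_n == j]
    one_hot = np.eye(num_classes)[noisy_labels]
    counts = posterior.T @ one_hot
    return counts / (counts.sum(axis=1, keepdims=True) + eps)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    probs = softmax(rng.normal(size=(6, 3)))      # stand-in for classifier output
    T0 = np.full((3, 3), 0.1) + 0.7 * np.eye(3)   # mostly-diagonal initial guess
    observed = np.array([0, 1, 2, 0, 1, 2])       # observed (possibly noisy) labels
    print(noisy_label_probs(probs, T0))
    print(em_update_transition(probs, observed, T0))
```

In a training loop of this kind, the noisy-label distribution would typically be scored against the observed labels with cross-entropy, while the matrix is refreshed between epochs. The implicit transition described in the abstract would instead be an additional non-linear layer trained jointly with the rest of the network.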
