Paper Title

A Probabilistic Model with Commonsense Constraints for Pattern-based Temporal Fact Extraction

Authors

Yang Zhou, Tong Zhao, Meng Jiang

Abstract

Textual patterns (e.g., Country's president Person) are specified and/or generated for extracting factual information from unstructured data. Pattern-based information extraction methods have been recognized for their efficiency and transferability. However, not every pattern is reliable: a major challenge is to derive the most complete and accurate facts from diverse and sometimes conflicting extractions. In this work, we propose a probabilistic graphical model that formulates fact extraction as a generative process. It automatically infers true facts and pattern reliability without any supervision. It has two novel designs specifically for temporal facts: (1) it models pattern reliability on two types of time signals, namely temporal tags in the text and text generation time; (2) it models commonsense constraints as observable variables. Experimental results demonstrate that our model significantly outperforms existing methods on extracting true temporal facts from news data.
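To make the idea of jointly inferring true facts and pattern reliability concrete, here is a minimal Python sketch. It is not the paper's probabilistic graphical model: it swaps in a simple iterative weighted-voting scheme, collapses the two time signals into a single year, and assumes one illustrative commonsense constraint (a country has one president in a given year). The function name `infer_facts`, the pattern strings, and the toy extractions are all hypothetical.

```python
from collections import defaultdict

def infer_facts(extractions, n_iters=10, prior=0.8):
    """Toy unsupervised truth discovery (an illustration, not the paper's model).

    extractions: list of (pattern, country, person, year) tuples.
    Returns (best person per (country, year) slot, reliability per pattern).
    """
    patterns = {p for p, _, _, _ in extractions}
    reliability = {p: prior for p in patterns}
    best = {}
    for _ in range(n_iters):
        # Score each candidate by the summed reliability of the patterns that extracted it.
        scores = defaultdict(lambda: defaultdict(float))
        for p, country, person, year in extractions:
            scores[(country, year)][person] += reliability[p]
        # Assumed constraint: one president per country per year, so keep the top candidate.
        best = {slot: max(cands, key=cands.get) for slot, cands in scores.items()}
        # Re-estimate reliability as the smoothed share of a pattern's extractions
        # that agree with the currently selected facts.
        hits, total = defaultdict(float), defaultdict(float)
        for p, country, person, year in extractions:
            total[p] += 1
            hits[p] += best[(country, year)] == person
        reliability = {p: (hits[p] + 1) / (total[p] + 2) for p in patterns}
    return best, reliability

# Hypothetical patterns and extractions, for illustration only.
P1 = "COUNTRY's president PERSON"
P2 = "president PERSON of COUNTRY"
P3 = "PERSON visited COUNTRY"  # an unreliable pattern for the president relation
toy = [
    (P1, "France", "Macron", 2018), (P2, "France", "Macron", 2018),
    (P3, "France", "Trump", 2018),
    (P1, "USA", "Trump", 2018), (P3, "USA", "Trump", 2018),
    (P1, "Brazil", "Temer", 2018), (P2, "Brazil", "Temer", 2018),
    (P3, "Brazil", "Macron", 2018),
]
best, reliability = infer_facts(toy)
```

In this toy run the conflicting extractions for France and Brazil are resolved in favor of the candidates backed by the more reliable patterns, and the "visited" pattern ends up with a lower reliability score than the two president patterns. The paper's model instead treats these quantities as variables in a generative process and handles the two time signals separately.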
