Title

Linking Entities to Unseen Knowledge Bases with Arbitrary Schemas

Authors

Yogarshi Vyas, Miguel Ballesteros

Abstract

In entity linking, mentions of named entities in raw text are disambiguated against a knowledge base (KB). This work focuses on linking to unseen KBs that do not have training data and whose schema is unknown during training. Our approach relies on methods to flexibly convert entities from arbitrary KBs with several attribute-value pairs into flat strings, which we use in conjunction with state-of-the-art models for zero-shot linking. To improve the generalization of our model, we use two regularization schemes based on shuffling of entity attributes and handling of unseen attributes. Experiments on English datasets, where models are trained on the CoNLL dataset and tested on the TAC-KBP 2010 dataset, show that our models outperform baseline models by over 12 points of accuracy. Unlike prior work, our approach also allows for seamlessly combining multiple training datasets. We test this ability by adding a completely different dataset (Wikia) as well as increasing amounts of training data from the TAC-KBP 2010 training set. Our models perform favorably across the board.
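The core idea the abstract describes, converting an entity's attribute-value pairs into a flat string and shuffling attributes as a regularizer, can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation: the `[attr]`-style markers, the separator, and the function name are assumptions for demonstration only.

```python
import random

def flatten_entity(attributes, shuffle=False, seed=None):
    """Flatten a KB entity's attribute-value pairs into one string.

    `attributes` maps attribute names to values. The bracketed
    attribute markers are a hypothetical serialization format;
    the paper's actual format may differ.
    """
    items = list(attributes.items())
    if shuffle:
        # Attribute shuffling: randomize attribute order so the
        # model does not overfit to one KB schema's ordering.
        random.Random(seed).shuffle(items)
    return " ".join(f"[{key}] {value}" for key, value in items)

entity = {
    "name": "Barack Obama",
    "type": "PER",
    "description": "44th president of the United States",
}
print(flatten_entity(entity))
print(flatten_entity(entity, shuffle=True, seed=0))
```

Because the flattened form makes no assumption about which attributes exist, the same serialization applies to any KB schema, which is what enables linking to unseen KBs at test time.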
