Paper Title

HoVer: A Dataset for Many-Hop Fact Extraction And Claim Verification

Paper Authors

Yichen Jiang, Shikha Bordia, Zheng Zhong, Charles Dognin, Maneesh Singh, Mohit Bansal

Abstract

We introduce HoVer (HOppy VERification), a dataset for many-hop evidence extraction and fact verification. It challenges models to extract facts from several Wikipedia articles that are relevant to a claim and classify whether the claim is Supported or Not-Supported by the facts. In HoVer, the claims require evidence to be extracted from as many as four English Wikipedia articles and embody reasoning graphs of diverse shapes. Moreover, most of the 3/4-hop claims are written in multiple sentences, which adds to the complexity of understanding long-range dependency relations such as coreference. We show that the performance of an existing state-of-the-art semantic-matching model degrades significantly on our dataset as the number of reasoning hops increases, hence demonstrating the necessity of many-hop reasoning to achieve strong results. We hope that the introduction of this challenging dataset and the accompanying evaluation task will encourage research in many-hop fact retrieval and information verification. We make the HoVer dataset publicly available at https://hover-nlp.github.io
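
To make the task and data format concrete, below is a minimal Python sketch of what a HoVer-style example record might look like, together with a helper that lists the Wikipedia articles its evidence spans. The field names (claim, supporting_facts, label, num_hops) and the toy example content are illustrative assumptions based on the abstract, not the official schema published at https://hover-nlp.github.io.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Illustrative sketch of a HoVer-style record; field names are assumptions
# based on the abstract, not the official dataset schema.

@dataclass
class HoverExample:
    claim: str                               # natural-language claim, possibly spanning multiple sentences
    supporting_facts: List[Tuple[str, int]]  # (Wikipedia article title, sentence index) pairs
    label: str                               # "SUPPORTED" or "NOT_SUPPORTED"
    num_hops: int                            # number of articles the evidence is drawn from (2-4)


def evidence_articles(example: HoverExample) -> List[str]:
    """Return the distinct Wikipedia article titles an example draws evidence from."""
    seen: List[str] = []
    for title, _ in example.supporting_facts:
        if title not in seen:
            seen.append(title)
    return seen


if __name__ == "__main__":
    # Toy 3-hop example; the content is invented purely for illustration.
    ex = HoverExample(
        claim=("The director of Film X was born in City Y. "
               "City Y is the capital of Country Z."),
        supporting_facts=[("Film X", 0), ("Director A", 1), ("City Y", 0)],
        label="SUPPORTED",
        num_hops=3,
    )
    print(evidence_articles(ex))  # ['Film X', 'Director A', 'City Y']
    print(ex.num_hops)            # 3
```

A verification model for this task would take the claim plus retrieved sentences as input and predict the binary Supported / Not-Supported label, with retrieval evaluated against the supporting_facts annotations.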
