Paper Title

Document-Level Definition Detection in Scholarly Documents: Existing Models, Error Analyses, and Future Directions

Authors

Dongyeop Kang, Andrew Head, Risham Sidhu, Kyle Lo, Daniel S. Weld, Marti A. Hearst

Abstract

The task of definition detection is important for scholarly papers, because papers often make use of technical terminology that may be unfamiliar to readers. Despite prior work on definition detection, current approaches are far from being accurate enough to use in real-world applications. In this paper, we first perform an in-depth error analysis of the current best-performing definition detection system and discover major causes of errors. Based on this analysis, we develop a new definition detection system, HEDDEx, that utilizes syntactic features, transformer encoders, and heuristic filters, and evaluate it on a standard sentence-level benchmark. Because current benchmarks evaluate randomly sampled sentences, we propose an alternative evaluation that assesses every sentence within a document. This also allows for evaluating recall in addition to precision. HEDDEx outperforms the leading system on both the sentence-level and the document-level tasks, by 12.7 F1 points and 14.4 F1 points, respectively. We note that performance on the high-recall document-level task is much lower than in the standard evaluation approach, due to the necessity of incorporating document structure as features. We discuss remaining challenges in document-level definition detection, ideas for improvements, and potential issues for the development of reading aid applications.
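The abstract notes that HEDDEx combines a transformer encoder with heuristic filters. As an illustration only, and not the authors' actual implementation, a minimal sketch of one such heuristic, a cue-pattern pre-filter that screens candidate definition sentences before passing them to an expensive classifier, might look like the following (the cue list and function name are hypothetical):

```python
import re

# Hypothetical definitional cue patterns; HEDDEx's real filters
# are richer (e.g., syntactic features) and are not reproduced here.
DEFINITION_CUES = [
    r"\bis defined as\b",
    r"\brefers? to\b",
    r"\bdenotes?\b",
    r"\bis called\b",
    r"\bwe (?:define|call)\b",
]
CUE_RE = re.compile("|".join(DEFINITION_CUES), re.IGNORECASE)


def is_definition_candidate(sentence: str) -> bool:
    """Cheap heuristic pre-filter: only sentences containing a
    definitional cue are forwarded to the transformer classifier."""
    return CUE_RE.search(sentence) is not None


sentences = [
    "A transformer is defined as a model built on self-attention.",
    "We ran the experiment five times.",
]
candidates = [s for s in sentences if is_definition_candidate(s)]
```

Such a filter trades a small amount of recall for a large reduction in the number of sentences the document-level system must score, which matters when every sentence in a document is evaluated.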
