Paper Title

A Scientific Information Extraction Dataset for Nature Inspired Engineering

Paper Authors

Ruben Kruiper, Julian F. V. Vincent, Jessica Chen-Burger, Marc P. Y. Desmulliez, Ioannis Konstas

Paper Abstract

Nature has inspired various ground-breaking technological developments in applications ranging from robotics to aerospace engineering and the manufacturing of medical devices. However, accessing the information captured in scientific biology texts is a time-consuming and difficult task that requires domain-specific knowledge. Improving access for outsiders can help interdisciplinary research such as Nature Inspired Engineering. This paper describes a dataset of 1,500 manually annotated sentences that express domain-independent relations, such as trade-offs and correlations, between central concepts in a scientific biology text. The arguments of these relations can be multi-word expressions and have been annotated with modifying phrases to form non-projective graphs. The dataset allows for training and evaluating Relation Extraction algorithms that aim for coarse-grained typing of scientific biological documents, enabling a high-level filter for engineers.
