Paper Title

Named Entities in Medical Case Reports: Corpus and Experiments

Authors

Sarah Schulz, Jurica Ševa, Samuel Rodriguez, Malte Ostendorff, Georg Rehm

Abstract

We present a new corpus comprising annotations of medical entities in case reports, originating from PubMed Central's open access library. In the case reports, we annotate cases, conditions, findings, factors and negation modifiers. Moreover, where applicable, we annotate relations between these entities. As such, this is the first corpus of this kind made available to the scientific community in English. It enables the initial investigation of automatic information extraction from case reports through tasks like Named Entity Recognition, Relation Extraction and (sentence/paragraph) relevance detection. Additionally, we present four strong baseline systems for the detection of medical entities made available through the annotated dataset.
