Paper title
Multilingual enrichment of disease biomedical ontologies
Paper authors
Paper abstract
Translating biomedical ontologies is an important challenge, but doing it manually requires much time and money. We study the possibility of using open-source knowledge bases to translate biomedical ontologies. We focus on two aspects: coverage and quality. We measure the coverage in Wikidata of two disease-focused biomedical ontologies for 9 European languages (Czech, Dutch, English, French, German, Italian, Polish, Portuguese and Spanish) for both ontologies, plus Arabic, Chinese and Russian for the second one. We first use direct links between Wikidata and the studied ontologies, and then use second-order links that go through other intermediate ontologies. We then compare the quality of the translations obtained from Wikidata with those produced by a commercial machine translation tool, here Google Cloud Translation.
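The difference between direct and second-order links, and the way coverage can be counted per language, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the identifiers (`ONTO:…`, `MID:…`, `Q_A`, `Q_B`), the labels, and the function names `find_item`, `translate` and `coverage` are hypothetical and are not taken from the paper or from Wikidata.

```python
from typing import Dict, List, Optional

# Hypothetical toy data: none of these identifiers or labels come from the paper.
# Direct links: concept ID in the studied ontology -> knowledge-base item ID.
DIRECT_LINKS: Dict[str, str] = {"ONTO:0001": "Q_A"}

# Cross-references from the studied ontology to an intermediate ontology.
XREFS: Dict[str, List[str]] = {"ONTO:0002": ["MID:42"]}

# Links from the intermediate ontology to knowledge-base items.
INTERMEDIATE_LINKS: Dict[str, str] = {"MID:42": "Q_B"}

# Labels attached to each item, keyed by language code.
LABELS: Dict[str, Dict[str, str]] = {
    "Q_A": {"en": "disease A", "fr": "maladie A"},
    "Q_B": {"en": "disease B"},
}


def find_item(concept_id: str) -> Optional[str]:
    """Return the linked item, trying a direct link first, then second-order links."""
    if concept_id in DIRECT_LINKS:
        return DIRECT_LINKS[concept_id]
    # Second-order link: go through a cross-referenced intermediate concept.
    for xref in XREFS.get(concept_id, []):
        if xref in INTERMEDIATE_LINKS:
            return INTERMEDIATE_LINKS[xref]
    return None


def translate(concept_id: str, lang: str) -> Optional[str]:
    """Return the label of the linked item in `lang`, or None if unavailable."""
    item = find_item(concept_id)
    if item is None:
        return None
    return LABELS.get(item, {}).get(lang)


def coverage(concept_ids: List[str], lang: str) -> float:
    """Fraction of concepts for which a label in `lang` can be retrieved."""
    if not concept_ids:
        return 0.0
    covered = sum(1 for c in concept_ids if translate(c, lang) is not None)
    return covered / len(concept_ids)
```

In this toy setup, `ONTO:0002` has no direct link and is only reachable through the intermediate concept `MID:42`, so adding second-order links raises coverage for languages in which the reached item carries a label.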