Paper Title

Benchmarking Automated Clinical Language Simplification: Dataset, Algorithm, and Evaluation

Paper Authors

Junyu Luo, Zifei Zheng, Hanzhong Ye, Muchao Ye, Yaqing Wang, Quanzeng You, Cao Xiao, Fenglong Ma

Paper Abstract

Patients with low health literacy usually have difficulty understanding medical jargon and the complex structure of professional medical language. Although several approaches have been proposed to automatically translate expert language into layperson-understandable language, only a few of them address both accuracy and readability simultaneously in the clinical domain. Clinical language simplification therefore remains a challenging task that has not been fully addressed in previous work. To benchmark this task, we construct a new dataset named MedLane to support the development and evaluation of automated clinical language simplification approaches. In addition, we propose a new model called DECLARE that follows the human annotation procedure and achieves state-of-the-art performance compared with eight strong baselines. To evaluate performance fairly, we also propose three task-specific evaluation metrics. Experimental results demonstrate the utility of the annotated MedLane dataset and the effectiveness of the proposed DECLARE model.
