Paper Title

Toward Improving Health Literacy in Patient Education Materials with Neural Machine Translation Models

Authors

Oniani, David; Sreekumar, Sreekanth; DeAlmeida, Renuk; DeAlmeida, Dinuk; Hui, Vivian; Lee, Young Ji; Zhang, Yiye; Zhou, Leming; Wang, Yanshan

Abstract

Health literacy is the central focus of Healthy People 2030, the fifth iteration of the U.S. national goals and objectives. People with low health literacy usually have trouble understanding health information, following post-visit instructions, and using prescriptions, which results in worse health outcomes and serious health disparities. In this study, we propose to leverage natural language processing techniques to improve health literacy in patient education materials by automatically translating health-illiterate language in a given sentence. We scraped patient education materials from four online health information websites: MedlinePlus.gov, Drugs.com, Mayoclinic.org, and Reddit.com. We trained and tested state-of-the-art neural machine translation (NMT) models on a silver-standard training dataset and a gold-standard testing dataset, respectively. The experimental results showed that the Bidirectional Long Short-Term Memory (BiLSTM) NMT model outperformed the Bidirectional Encoder Representations from Transformers (BERT)-based NMT models. We also verified the effectiveness of the NMT models in translating health-illiterate language by comparing the ratio of health-illiterate language in the sentences. The proposed NMT models were able to identify the correct complicated words and simplify them into layman's language; at the same time, the models suffered from problems with sentence completeness, fluency, and readability, and had difficulty translating certain medical terms.
