Paper title
Automated Identification of Eviction Status from Electronic Health Record Notes
Paper authors
Paper abstract
Objective: Evictions are important social and behavioral determinants of health. They are associated with a cascade of negative events that can lead to unemployment, housing insecurity/homelessness, long-term poverty, and mental health problems. In this study, we developed a natural language processing system to automatically detect eviction status from electronic health record (EHR) notes. Materials and Methods: We first defined eviction status (eviction presence and eviction period) and then annotated eviction status in 5000 EHR notes from the Veterans Health Administration (VHA). We developed a novel model, KIRESH, that substantially outperformed other state-of-the-art approaches, such as fine-tuned pre-trained language models like BioBERT and BioClinicalBERT. Moreover, we designed a novel prompt that further improves model performance by exploiting the intrinsic connection between the two sub-tasks of eviction presence and eviction period prediction. Finally, we applied temperature scaling-based calibration to our KIRESH-Prompt method to avoid over-confidence issues arising from the imbalanced dataset. Results: KIRESH-Prompt substantially outperformed strong baseline models, including a fine-tuned BioClinicalBERT model, achieving 0.74672 MCC, 0.71153 Macro-F1, and 0.83396 Micro-F1 in predicting eviction period and 0.66827 MCC, 0.62734 Macro-F1, and 0.7863 Micro-F1 in predicting eviction presence. We also conducted additional experiments on a benchmark social and behavioral determinants of health (SBDH) dataset to demonstrate the generalizability of our methods. Conclusion and Future Work: KIRESH-Prompt substantially improved eviction status classification. We plan to deploy KIRESH-Prompt to the VHA EHRs as an eviction surveillance system to help address housing insecurity among US Veterans.
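The temperature scaling step mentioned in the abstract can be illustrated with a minimal, generic sketch. This is not the authors' KIRESH-Prompt implementation: the logits and labels below are invented toy values, the function names (`softmax`, `nll`, `fit_temperature`) are hypothetical, and a simple log-spaced grid search stands in for the gradient-based optimizer that is more commonly used. The idea is to fit a single scalar temperature on held-out data so that the softmax probabilities become less over-confident while the predicted class stays the same.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax of logits divided by a temperature."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def nll(logit_rows, labels, temperature):
    """Mean negative log-likelihood of the true labels at a given temperature."""
    loss = 0.0
    for logits, label in zip(logit_rows, labels):
        loss -= math.log(softmax(logits, temperature)[label])
    return loss / len(labels)


def fit_temperature(logit_rows, labels, low=0.05, high=10.0, steps=200):
    """Pick the temperature on a log-spaced grid that minimizes validation NLL."""
    best_t, best_loss = 1.0, nll(logit_rows, labels, 1.0)
    for i in range(steps + 1):
        t = low * (high / low) ** (i / steps)
        loss = nll(logit_rows, labels, t)
        if loss < best_loss:
            best_t, best_loss = t, loss
    return best_t


# Toy held-out logits for a 3-class task (assumed values, not from the paper).
# Rows 3 and 4 are confident mistakes, which is what over-confidence looks like.
val_logits = [
    [8.0, 0.0, 0.0],
    [0.0, 7.0, 1.0],
    [9.0, 1.0, 0.0],
    [6.0, 0.0, 5.0],
    [0.0, 1.0, 8.0],
    [7.0, 0.0, 1.0],
]
val_labels = [0, 1, 1, 2, 2, 0]

temperature = fit_temperature(val_logits, val_labels)
raw_probs = softmax(val_logits[2])
calibrated_probs = softmax(val_logits[2], temperature)
print("fitted temperature:", round(temperature, 3))
print("max probability before/after:", max(raw_probs), max(calibrated_probs))
```

Because dividing every logit by the same positive constant preserves their ordering, this kind of calibration changes the confidence attached to a prediction without changing which class is predicted, so accuracy-style metrics such as MCC and F1 are unaffected.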