Paper Title

Advances in Prediction of Readmission Rates Using Long Term Short Term Memory Networks on Healthcare Insurance Data

Authors

Shuja Khalid, Francisco Matos, Ayman Abunimer, Joel Bartlett, Richard Duszak, Michal Horny, Judy Gichoya, Imon Banerjee, Hari Trivedi

Abstract

Thirty-day hospital readmission is a long-standing medical problem that affects patients' morbidity and mortality and costs billions of dollars annually. Recently, machine learning models have been created to predict the risk of inpatient readmission for patients with specific diseases; however, no model exists to predict this risk across all patients. We developed a bi-directional Long Short Term Memory (LSTM) network that is able to use readily available insurance data (inpatient visits, outpatient visits, and drug prescriptions) to predict 30-day readmission for any admitted patient, regardless of reason. The top-performing model achieved an ROC AUC of 0.763 (0.011) when using historical, inpatient, and post-discharge data. The LSTM model significantly outperformed a baseline random forest classifier, indicating that understanding the sequence of events is important for model prediction. Incorporation of 30 days of historical data also significantly improved model performance compared to inpatient data alone, indicating that a patient's clinical history prior to admission, including outpatient visits and pharmacy data, is a strong contributor to readmission. Our results demonstrate that a machine learning model is able to predict the risk of inpatient readmission with reasonable accuracy for all patients using structured insurance billing data. Because billing data or equivalent surrogates can be extracted from sites, such a model could be deployed to identify patients at risk for readmission before they are discharged, or to assign more robust follow-up (closer follow-up, home health, mailed medications) to at-risk patients after discharge.
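The prediction target named in the abstract can be illustrated with a small sketch. The abstract does not state how the authors derived their labels, so everything below is an assumption made for illustration: the function name `label_30_day_readmissions`, the tuple layout of a stay, and the rule that a stay counts as positive when the same patient's next admission begins within 30 days of discharge.

```python
from datetime import date


def label_30_day_readmissions(stays, window_days=30):
    """Return a 0/1 label per stay, aligned with the input order.

    stays: list of (patient_id, admit_date, discharge_date) tuples.
    A stay is labeled 1 when the same patient's next admission starts
    within `window_days` of this stay's discharge (illustrative rule,
    not taken from the paper).
    """
    # Group stay indices by patient so each history can be ordered in time.
    by_patient = {}
    for idx, (patient_id, admit, _discharge) in enumerate(stays):
        by_patient.setdefault(patient_id, []).append((admit, idx))

    labels = [0] * len(stays)
    for entries in by_patient.values():
        entries.sort()  # chronological order by admission date
        # Compare each stay with the same patient's next stay.
        for (_, current), (next_admit, _) in zip(entries, entries[1:]):
            gap = (next_admit - stays[current][2]).days
            if 0 <= gap <= window_days:
                labels[current] = 1
    return labels


example_stays = [
    ("a", date(2020, 1, 1), date(2020, 1, 5)),
    ("a", date(2020, 1, 20), date(2020, 1, 22)),
    ("a", date(2020, 6, 1), date(2020, 6, 3)),
    ("b", date(2020, 3, 1), date(2020, 3, 2)),
]
example_labels = label_30_day_readmissions(example_stays)
```

In this example only the first stay of patient "a" is followed by another admission inside the window, so it is the only positive label. Labels of this kind would be the targets a sequence model or a baseline classifier is trained against.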
