Paper Title
Time Series Analysis and Forecasting of COVID-19 Cases Using LSTM and ARIMA Models
Paper Authors
Paper Abstract
Coronavirus disease 2019 (COVID-19) is a global public health crisis that has been declared a pandemic by the World Health Organization. Forecasting country-wise COVID-19 cases is necessary to help policymakers and healthcare providers prepare for the future. This study explores the performance of several Long Short-Term Memory (LSTM) models and an Auto-Regressive Integrated Moving Average (ARIMA) model in forecasting the number of confirmed COVID-19 cases. Time series of daily cumulative COVID-19 cases were used to generate 1-day, 3-day, and 5-day forecasts with several LSTM models and ARIMA. Two novel k-period performance metrics, k-day Mean Absolute Percentage Error (kMAPE) and k-day Median Symmetric Accuracy (kMdSA), were developed to evaluate the performance of the models in forecasting time series values for multiple days. Prediction errors measured by kMAPE and kMdSA were both as low as 0.05% for the LSTM models, while those for ARIMA were 0.07% and 0.06%, respectively. The LSTM models slightly underestimated the case counts, while ARIMA slightly overestimated them. The performance of the LSTM models is comparable to that of ARIMA in forecasting COVID-19 cases. While ARIMA requires longer sequences, LSTMs can perform reasonably well with sequence sizes as small as 3. However, LSTMs require a large number of training samples. Further, the proposed k-period performance metrics are likely to be useful for evaluating time series models that predict multiple periods. Based on these metrics, both LSTMs and ARIMA are useful for time series analysis and forecasting of COVID-19.
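The abstract names kMAPE and kMdSA but does not give their formulas, so the following is only a minimal sketch of how such k-day metrics might be computed. It assumes that errors are pooled over every day of every k-day forecast window, and that kMdSA follows the standard median symmetric accuracy form, 100 * (exp(median(|ln(forecast / actual)|)) - 1). The function names `kmape` and `kmdsa`, the helper `_paired_values`, and the window-based input layout are hypothetical and are not taken from the paper.

```python
import math
from statistics import mean, median


def _paired_values(actual_windows, forecast_windows):
    """Flatten matching k-day windows into (actual, forecast) pairs."""
    pairs = []
    for actual, forecast in zip(actual_windows, forecast_windows):
        if len(actual) != len(forecast):
            raise ValueError("each forecast window must match its actual window in length")
        pairs.extend(zip(actual, forecast))
    if not pairs:
        raise ValueError("at least one window is required")
    return pairs


def kmape(actual_windows, forecast_windows):
    """Assumed k-day MAPE: mean absolute percentage error pooled over all days
    of all k-day windows, returned as a percentage."""
    pairs = _paired_values(actual_windows, forecast_windows)
    return 100.0 * mean(abs(f - a) / abs(a) for a, f in pairs)


def kmdsa(actual_windows, forecast_windows):
    """Assumed k-day median symmetric accuracy, returned as a percentage.
    Requires strictly positive values, which holds for cumulative case counts."""
    pairs = _paired_values(actual_windows, forecast_windows)
    log_ratios = [abs(math.log(f / a)) for a, f in pairs]
    return 100.0 * (math.exp(median(log_ratios)) - 1.0)


# Illustrative numbers only: one 3-day window of cumulative counts.
actual = [[100, 110, 120]]
forecast = [[101, 110, 118]]
print(f"kMAPE = {kmape(actual, forecast):.3f}%")
print(f"kMdSA = {kmdsa(actual, forecast):.3f}%")
```

Because the median is insensitive to a few large misses and the log ratio treats over- and underestimation symmetrically, a metric of this form complements a mean-based percentage error when forecasts span several days.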