Title
A simulation study of disaggregation regression for spatial disease mapping
Authors
Abstract
Disaggregation regression has become an important tool in spatial disease mapping for making fine-scale predictions of disease risk from aggregated response data. By including high-resolution covariate information and modelling the data-generating process on a fine scale, it is hoped that these models can accurately learn the relationships between covariates and response at a fine spatial scale. However, validating these high-resolution predictions can be a challenge, as data are often not observed at this spatial scale. In this study, disaggregation regression was performed on simulated data in various settings, and the resulting fine-scale predictions were compared to the simulated ground truth. Performance was investigated with varying numbers of data points, sizes of aggregated areas and levels of model misspecification. The effectiveness of cross-validation on the aggregate level as a measure of fine-scale predictive performance was also investigated. Predictive performance improved as the number of observations increased and as the size of the aggregated areas decreased. When the model was well specified, fine-scale predictions were accurate even with small numbers of observations and large aggregated areas. Under model misspecification, predictive performance was significantly worse for large aggregated areas but remained high when response data were aggregated over smaller regions. Cross-validation correlation on the aggregate level was a moderately good predictor of fine-scale predictive performance. While the simulations are unlikely to capture the nuances of real-life response data, this study gives insight into the effectiveness of disaggregation regression in different contexts.
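The workflow summarised in the abstract (simulate a fine-scale risk surface, observe only aggregated counts, fit a fine-scale model, compare predictions with the ground truth) can be illustrated with a minimal sketch. This is not the implementation used in the study: it assumes a single hypothetical covariate, square aggregation areas, a Poisson likelihood with a log link, and no spatial random field. The function names, grid size, population range and coefficient values are all assumptions chosen for illustration.

```python
import numpy as np


def simulate(n_side=40, block=5, beta=(-3.0, 0.8), seed=0):
    """Simulate a fine-scale risk surface and aggregate counts over square areas.

    Assumes n_side is divisible by block. All settings are illustrative.
    """
    rng = np.random.default_rng(seed)
    rows, cols = np.indices((n_side, n_side))
    # Hypothetical covariate: a west-east gradient plus pixel-level noise.
    x = np.linspace(-1.5, 1.5, n_side)[cols] + 0.5 * rng.normal(size=(n_side, n_side))
    pop = rng.uniform(50, 150, size=(n_side, n_side))  # population per pixel
    rate = np.exp(beta[0] + beta[1] * x)               # simulated ground-truth risk
    area = (rows // block) * (n_side // block) + cols // block
    expected = np.bincount(area.ravel(), weights=(pop * rate).ravel())
    cases = rng.poisson(expected)                      # only aggregates are observed
    return x.ravel(), pop.ravel(), rate.ravel(), area.ravel(), cases


def neg_log_lik(beta, X, pop, area, cases):
    """Poisson negative log-likelihood of the aggregated counts (up to a constant)."""
    mu = np.bincount(area, weights=pop * np.exp(X @ beta))
    return np.sum(mu - cases * np.log(mu))


def fit(x, pop, area, cases, n_iter=50, tol=1e-8):
    """Fit the fine-scale log-linear model by Fisher scoring with step halving."""
    X = np.column_stack([np.ones_like(x), x])
    # Start from the crude overall rate and no covariate effect.
    beta = np.array([np.log(cases.sum() / pop.sum()), 0.0])
    for _ in range(n_iter):
        pixel_mu = pop * np.exp(X @ beta)
        mu = np.bincount(area, weights=pixel_mu)  # expected cases per area
        # G[a, k] is the derivative of mu[a] with respect to beta[k].
        G = np.column_stack(
            [np.bincount(area, weights=pixel_mu * X[:, k]) for k in range(X.shape[1])]
        )
        score = G.T @ (cases / mu - 1.0)
        info = G.T @ (G / mu[:, None])
        step = np.linalg.solve(info, score)
        current = neg_log_lik(beta, X, pop, area, cases)
        # Halve the step until the likelihood does not get worse.
        while (not (neg_log_lik(beta + step, X, pop, area, cases) <= current)
               and np.abs(step).max() > tol):
            step = step / 2.0
        beta = beta + step
        if np.abs(step).max() < tol:
            break
    return beta


x, pop, rate, area, cases = simulate()
beta_hat = fit(x, pop, area, cases)
pred = np.exp(beta_hat[0] + beta_hat[1] * x)  # fine-scale prediction
corr = np.corrcoef(pred, rate)[0, 1]          # agreement with the ground truth
print("estimated coefficients:", beta_hat)
print("fine-scale correlation with ground truth:", corr)
```

Increasing `block` makes the aggregation areas larger and reduces the number of observations, which is one way to explore the settings described above. Model misspecification could be mimicked by simulating with a covariate that is then withheld from `fit`.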