Paper title
Massive Parallelization of Massive Sample-size Survival Analysis
Paper authors
Abstract
Large-scale observational health databases are increasingly popular for conducting comparative effectiveness and safety studies of medical products. However, the growing number of patients poses computational challenges when fitting survival regression models in such studies. In this paper, we use graphics processing units (GPUs) to parallelize the computational bottlenecks of massive sample-size survival analyses. Specifically, we develop and apply time- and memory-efficient single-pass parallel scan algorithms for Cox proportional hazards models and forward-backward parallel scan algorithms for Fine-Gray models, for analysis with and without a competing risk, using a cyclic coordinate descent optimization approach. We demonstrate that GPUs accelerate the fitting of these complex models in large databases by orders of magnitude compared with traditional multi-core CPU parallelism. Our implementation enables efficient large-scale observational studies involving millions of patients and thousands of patient characteristics. The implementation is available in the open-source R package Cyclops (Suchard et al., 2013).
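To illustrate why a scan fits the Cox model, the sketch below computes the log partial likelihood for a single covariate. It is a minimal illustration, not the Cyclops implementation: the function names are hypothetical, observation times are assumed to be distinct (no tie handling), and a sequential `itertools.accumulate` stands in for the parallel scan that would run on a GPU. Once subjects are sorted by decreasing time, every risk-set sum is a prefix sum, so all denominators come from one pass.

```python
from itertools import accumulate
from math import exp, log


def cox_log_partial_likelihood_scan(times, events, x, beta):
    """Log partial likelihood of a one-covariate Cox model via a prefix sum.

    Assumes distinct observation times (illustrative sketch, no tie handling).
    """
    # Sort subjects by decreasing time so that each risk set is a prefix.
    order = sorted(range(len(times)), key=lambda i: times[i], reverse=True)
    eta = [x[i] * beta for i in order]
    # Inclusive scan: denom[k] sums exp(eta) over everyone still at risk
    # when the k-th subject (in sorted order) is observed.
    denom = list(accumulate(exp(e) for e in eta))
    # Only subjects with an observed event contribute a term.
    return sum(
        eta[k] - log(denom[k])
        for k, i in enumerate(order)
        if events[i] == 1
    )


def cox_log_partial_likelihood_naive(times, events, x, beta):
    """O(n^2) reference that rebuilds every risk set explicitly."""
    n = len(times)
    total = 0.0
    for i in range(n):
        if events[i] == 1:
            risk = sum(exp(x[j] * beta) for j in range(n) if times[j] >= times[i])
            total += x[i] * beta - log(risk)
    return total


# Toy data (made up for illustration): 4 subjects, one censored.
times = [5.0, 3.0, 8.0, 1.0]
events = [1, 0, 1, 1]
x = [0.5, -1.0, 2.0, 0.0]

scan_value = cox_log_partial_likelihood_scan(times, events, x, 0.3)
naive_value = cox_log_partial_likelihood_naive(times, events, x, 0.3)
```

Both functions return the same value on the toy data; the scan version replaces the quadratic risk-set loop with one sort and one cumulative sum, and the cumulative sum is the step that a parallel scan primitive can take over.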