Paper Title

A Flexible Approach for Predictive Biomarker Discovery

Authors

Philippe Boileau, Nina Ting Qi, Mark J. van der Laan, Sandrine Dudoit, Ning Leng

Abstract

Predictive biomarker discovery is an endeavor central to precision medicine: predictive biomarkers define patient subpopulations which stand to benefit most, or least, from a given treatment. The identification of these biomarkers is often the byproduct of the related but fundamentally different task of treatment rule estimation. Using treatment rule estimation methods to identify predictive biomarkers in clinical trials where the number of covariates exceeds the number of participants often results in high false discovery rates. The higher-than-expected number of false positives translates to wasted resources when conducting follow-up experiments for drug target identification and diagnostic assay development. Patient outcomes are in turn negatively affected. We propose a variable importance parameter for directly assessing the importance of potentially predictive biomarkers, and develop a flexible nonparametric inference procedure for this estimand. We prove that our estimator is double-robust and asymptotically linear under loose conditions on the data-generating process, permitting valid inference about the importance metric. The statistical guarantees of the method are verified in a thorough simulation study representative of randomized controlled trials with moderate and high-dimensional covariate vectors. Our procedure is then used to discover predictive biomarkers from among the tumor gene expression data of metastatic renal cell carcinoma patients enrolled in recently completed clinical trials. We find that our approach more readily discerns predictive from non-predictive biomarkers than procedures whose primary purpose is treatment rule estimation. An open-source software implementation of the methodology, the uniCATE R package, is briefly introduced.
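
As a rough illustration of what a per-biomarker importance score can look like, the Python sketch below ranks biomarkers by how strongly each one modifies the treatment effect in a simulated randomized trial. It is not the estimator proposed in the paper and does not reproduce the uniCATE package's interface. It assumes a trial with a known treatment probability and scores each biomarker by the slope of an inverse-probability-weighted outcome transform on the centered biomarker. The names `biomarker_importance`, `simulate_trial`, and `p_treat` are hypothetical and chosen only for this example.

```python
import numpy as np


def biomarker_importance(y, a, biomarkers, p_treat=0.5):
    """Illustrative score: slope of a transformed outcome on each biomarker.

    Assumption (not from the paper): treatment is randomized with known
    probability p_treat, so the transformed outcome below has conditional
    mean equal to the conditional average treatment effect.
    """
    y = np.asarray(y, dtype=float)
    a = np.asarray(a, dtype=float)
    b = np.asarray(biomarkers, dtype=float)

    # Inverse-probability-weighted outcome transform.
    t = y * (a - p_treat) / (p_treat * (1.0 - p_treat))

    # Simple linear regression slope of t on each centered biomarker.
    b_centered = b - b.mean(axis=0)
    t_centered = t - t.mean()
    numerator = (b_centered * t_centered[:, None]).sum(axis=0)
    denominator = (b_centered ** 2).sum(axis=0)
    return numerator / denominator


def simulate_trial(n=20000, p=5, seed=0):
    """Hypothetical trial: biomarker 0 is predictive, biomarker 1 prognostic."""
    rng = np.random.default_rng(seed)
    b = rng.normal(size=(n, p))
    a = rng.binomial(1, 0.5, size=n)
    # Biomarker 1 shifts the outcome in both arms (prognostic only);
    # biomarker 0 changes the size of the treatment effect (predictive).
    y = 1.0 * b[:, 1] + a * (0.5 + 2.0 * b[:, 0]) + rng.normal(size=n)
    return y, a, b


if __name__ == "__main__":
    y, a, b = simulate_trial()
    print(biomarker_importance(y, a, b))
```

In this simulated setting, the score for biomarker 0 should land near its true effect-modification slope of 2, while the prognostic biomarker 1 and the pure-noise biomarkers should score near zero. That difference is the distinction between predictive and non-predictive biomarkers that the abstract describes.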
