Paper Title

Deep learning generates custom-made logistic regression models for explaining how breast cancer subtypes are classified

Authors

Takuma Shibahara, Chisa Wada, Yasuho Yamashita, Kazuhiro Fujita, Masamichi Sato, Junichi Kuwata, Atsushi Okamoto, Yoshimasa Ono

Abstract

Differentiating the intrinsic subtypes of breast cancer is crucial for deciding the best treatment strategy. Deep learning can predict the subtypes from genetic information more accurately than conventional statistical methods, but to date, deep learning has not been directly utilized to examine which genes are associated with which subtypes. To clarify the mechanisms embedded in the intrinsic subtypes, we developed an explainable deep learning model called a point-wise linear (PWL) model that generates a custom-made logistic regression for each patient. Logistic regression, which is familiar to both physicians and medical informatics researchers, allows us to analyze the importance of the feature variables, and the PWL model harnesses these practical abilities of logistic regression. In this study, we show that analyzing breast cancer subtypes is clinically beneficial for patients and one of the best ways to validate the capability of the PWL model. First, we trained the PWL model with RNA-seq data to predict PAM50 intrinsic subtypes and applied it to the 41/50 genes of PAM50 through the subtype prediction task. Second, we developed a deep enrichment analysis method to reveal the relationships between the PAM50 subtypes and the copy numbers of breast cancer. Our findings showed that the PWL model utilized genes relevant to the cell cycle-related pathways. These preliminary successes in breast cancer subtype analysis demonstrate the potential of our analysis strategy to clarify the mechanisms underlying breast cancer and improve overall clinical outcomes.
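
The idea of a network that "generates a custom-made logistic regression for each patient" can be illustrated with a minimal NumPy sketch. This is not the authors' architecture: the class name `PointwiseLinearSketch`, the single tanh hidden layer, the random untrained parameters, and the choice of 50 input genes and 5 subtype classes are all assumptions made for illustration. The sketch only shows the general structure, in which a small network maps each sample to its own coefficient matrix and the prediction is an ordinary multinomial logistic regression using those coefficients.

```python
import numpy as np

rng = np.random.default_rng(0)


def softmax(z):
    """Numerically stable softmax over the last axis."""
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)


class PointwiseLinearSketch:
    """Hypothetical illustration: a network that emits per-sample
    logistic-regression coefficients (untrained, random parameters)."""

    def __init__(self, n_features, n_classes, n_hidden=16):
        self.n_features = n_features
        self.n_classes = n_classes
        # Assumed weight-generating network: one tanh hidden layer.
        self.W1 = rng.normal(scale=0.1, size=(n_features, n_hidden))
        self.b1 = np.zeros(n_hidden)
        self.W2 = rng.normal(scale=0.1, size=(n_hidden, n_classes * n_features))
        self.b2 = np.zeros(n_classes * n_features)

    def sample_weights(self, X):
        """Return coefficients of shape (n_samples, n_classes, n_features)."""
        h = np.tanh(X @ self.W1 + self.b1)
        w = h @ self.W2 + self.b2
        return w.reshape(len(X), self.n_classes, self.n_features)

    def contributions(self, X):
        """Per-feature contribution to each class logit: w(x) * x."""
        return self.sample_weights(X) * X[:, None, :]

    def predict_proba(self, X):
        """Class probabilities from the sample-specific linear model."""
        logits = self.contributions(X).sum(axis=-1)
        return softmax(logits)


# Usage with synthetic data standing in for expression values.
X = rng.normal(size=(4, 50))           # 4 samples, 50 genes (assumed)
model = PointwiseLinearSketch(50, 5)   # 5 subtype classes (assumed)
proba = model.predict_proba(X)         # shape (4, 5), rows sum to 1
per_gene = model.contributions(X)      # shape (4, 5, 50)
```

Because each logit is a plain sum of per-gene terms, the `contributions` array can be read the same way as coefficient-times-feature terms in a standard logistic regression, but separately for every sample.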
