Paper title
Gene-gene interaction analysis incorporating network information via a structured Bayesian approach
Paper authors
Paper abstract
Increasing evidence has shown that gene-gene interactions have important effects on the biological processes of human diseases. Due to the high dimensionality of genetic measurements, existing interaction analysis methods usually suffer from a lack of sufficient information and are still unsatisfactory. Biological networks have been massively accumulated, allowing researchers to identify biomarkers from a system perspective by utilizing network selection (where a network consists of functionally related biomarkers) as well as network structures. In main-effect analysis, network information has been widely incorporated, leading to biologically more meaningful and more accurate estimates. However, there is still a big gap in the context of interaction analysis. In this study, we develop a novel structured Bayesian interaction analysis approach that effectively incorporates network information. This study is among the first to identify gene-gene interactions with the assistance of network selection for phenotype prediction, while simultaneously accommodating the underlying network structures. It innovatively respects the multiple hierarchies among main effects, interactions, and networks. A Bayesian method is adopted, which has been shown to have multiple advantages over some other techniques. An efficient variational inference algorithm is developed to explore the posterior distribution. Extensive simulation studies demonstrate the practical superiority of the proposed approach. The analysis of TCGA data on melanoma and lung cancer leads to biologically sensible findings with satisfactory prediction accuracy and selection stability.
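The "multiple hierarchies" mentioned in the abstract can be illustrated with a small sketch. The abstract does not spell out the exact constraints, so the reading below is an assumption: a gene may be selected only if a network containing it is selected, and an interaction may be selected only if both of its genes are selected (often called strong hierarchy). The function names, gene labels, and network labels are hypothetical and only for illustration; this is not the paper's algorithm.

```python
from itertools import combinations


def candidate_interactions(selected_genes):
    """List all unordered gene pairs among the selected genes (sorted for stable output)."""
    return list(combinations(sorted(selected_genes), 2))


def respects_hierarchy(networks, selected_networks, selected_genes, selected_interactions):
    """Check an ASSUMED reading of the hierarchies:
    network -> gene: each selected gene belongs to some selected network;
    gene -> interaction: both genes of each selected interaction are selected.
    """
    # Collect every gene that belongs to at least one selected network.
    allowed_genes = set()
    for name in selected_networks:
        allowed_genes.update(networks[name])

    genes_ok = all(g in allowed_genes for g in selected_genes)
    interactions_ok = all(
        a in selected_genes and b in selected_genes
        for a, b in selected_interactions
    )
    return genes_ok and interactions_ok


# Hypothetical toy example: two networks over five genes.
toy_networks = {"N1": ["g1", "g2", "g3"], "N2": ["g4", "g5"]}
valid = respects_hierarchy(toy_networks, {"N1"}, {"g1", "g2"}, [("g1", "g2")])
invalid = respects_hierarchy(toy_networks, {"N1"}, {"g1", "g2"}, [("g1", "g4")])
```

In this toy example, `valid` is `True` because both genes sit in the selected network `N1`, while `invalid` is `False` because `g4` is not a selected gene, so the interaction `("g1", "g4")` violates the assumed strong hierarchy.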