Estimating Gaussian graphical models of multi-study data with Multi-Study Factor Analysis

Paper title

Estimating Gaussian graphical models of multi-study data with Multi-Study Factor Analysis

Authors

Shutta, Katherine H.; Scholtens, Denise M.; Lowe Jr., William L.; Balasubramanian, Raji; De Vito, Roberta

Abstract

Network models are powerful tools for gaining new insights from complex biological data. Most lines of investigation in biology involve comparing datasets in the setting where the same predictors are measured across multiple studies or conditions (multi-study data). Consequently, the development of statistical tools for network modeling of multi-study data is a highly active area of research. Multi-study factor analysis (MSFA) is a method for estimation of latent variables (factors) in multi-study data. In this work, we generalize MSFA by adding the capacity to estimate Gaussian graphical models (GGMs). Our new tool, MSFA-X, is a framework for latent variable-based graphical modeling of shared and study-specific signals in multi-study data. We demonstrate through simulation that MSFA-X can recover shared and study-specific GGMs and outperforms a graphical lasso benchmark. We apply MSFA-X to analyze maternal response to an oral glucose tolerance test in targeted metabolomic profiles from the Hyperglycemia and Adverse Pregnancy Outcomes (HAPO) Study, identifying network-level differences in glucose metabolism between women with and without gestational diabetes mellitus.
