Paper title
A Fair Experimental Comparison of Neural Network Architectures for Latent Representations of Multi-Omics for Drug Response Prediction
Paper authors
Paper abstract
Recent years have seen a surge of novel neural network architectures for integrating multi-omics data for prediction. Most of these architectures include either encoders alone or encoders and decoders, i.e., autoencoders of various sorts, to transform multi-omics data into latent representations. One important parameter is the depth of integration: the point at which the latent representations are computed or merged, which can be early, intermediate, or late. The literature on integration methods is growing steadily; however, close to nothing is known about the relative performance of these methods under fair experimental conditions and across different use cases. We developed a comparison framework that trains and optimizes multi-omics integration methods under equal conditions. We incorporated early integration and four recently published deep learning methods: MOLI, Super.FELT, OmiEmbed, and MOMA. Further, we devised a novel method, Omics Stacking, that combines the advantages of intermediate and late integration. Experiments were conducted on a public drug response data set with multiple omics data types (somatic point mutations, somatic copy number profiles, and gene expression profiles) obtained from cell lines, patient-derived xenografts, and patient samples. Our experiments confirmed that early integration has the lowest predictive performance. Overall, architectures that integrate a triplet loss achieved the best results. Statistically significant differences can rarely be observed overall; however, in terms of the average ranks of methods, Super.FELT consistently performs best in the cross-validation setting and Omics Stacking performs best in the external test set setting. The source code of all experiments is available at \url{https://github.com/kramerlab/Multi-Omics_analysis}.
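The three integration depths named in the abstract can be illustrated with a minimal NumPy sketch. It is not the implementation of any of the compared methods. The toy data, the layer sizes, the single-layer `encode` helper, and the random, untrained weights are all assumptions chosen only to show where the omics views are merged in each case.

```python
import numpy as np

rng = np.random.default_rng(0)


def encode(x, w):
    """Toy encoder (assumption): one linear layer followed by ReLU."""
    return np.maximum(x @ w, 0.0)


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


# Hypothetical toy data: 8 samples, three omics views of different width.
n_samples, latent_dim = 8, 4
omics = {
    "mutation": rng.integers(0, 2, size=(n_samples, 30)).astype(float),
    "copy_number": rng.normal(size=(n_samples, 20)),
    "expression": rng.normal(size=(n_samples, 50)),
}

# Early integration: concatenate the raw features, then use ONE encoder.
x_all = np.concatenate(list(omics.values()), axis=1)
z_early = encode(x_all, rng.normal(size=(x_all.shape[1], latent_dim)))
p_early = sigmoid(z_early @ rng.normal(size=latent_dim))

# Intermediate integration: one encoder per omics view, then concatenate
# the latent representations and feed them to ONE classifier.
latents = [
    encode(x, rng.normal(size=(x.shape[1], latent_dim))) for x in omics.values()
]
z_mid = np.concatenate(latents, axis=1)
p_mid = sigmoid(z_mid @ rng.normal(size=z_mid.shape[1]))

# Late integration: one encoder AND one classifier per omics view;
# only the per-view predictions are merged (here: a simple average).
p_per_view = [sigmoid(z @ rng.normal(size=latent_dim)) for z in latents]
p_late = np.mean(p_per_view, axis=0)

print(z_early.shape, z_mid.shape, p_late.shape)
```

In this sketch the only difference between the variants is the merge point: raw features (early), latent representations (intermediate), or predictions (late). Training, the triplet loss, and the decoders mentioned in the abstract are deliberately left out.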