Paper Title

Stochastic Variational Methods in Generalized Hidden Semi-Markov Models to Characterize Functionality in Random Heteropolymers

Authors

Zhou, Yun; Gong, Boying; Jiang, Tao; Xu, Ting; Huang, Haiyan

Abstract

Recent years have seen substantial advances in the development of biofunctional materials using synthetic polymers. The growing problem of elusive sequence-functionality relations for most biomaterials has driven researchers to seek more effective tools and analysis methods. In this study, statistical models are used to study sequence features of the recently reported random heteropolymers (RHP), which transport protons across lipid bilayers selectively and rapidly like natural proton channels. We utilized the probabilistic graphical model framework and developed a generalized hidden semi-Markov model (GHSMM-RHP) to extract the function-determining sequence features, including the transmembrane segments within a chain and the sequence heterogeneity among different chains. We developed stochastic variational methods for efficient inference on parameter estimation and predictions, and empirically studied their computational performance from a comparative perspective on Bayesian (i.e., stochastic variational Bayes) versus frequentist (i.e., stochastic variational expectation-maximization) frameworks that have been studied separately before. The real data results agree well with the laboratory experiments, and suggest GHSMM-RHP's potential in predicting protein-like behavior at the polymer-chain level.
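
As a rough illustration of what a hidden semi-Markov model is, the sketch below samples a toy polymer chain in which each hidden state persists for an explicitly drawn segment length before switching. It is not the GHSMM-RHP model from the paper: the two states, the monomer classes `H` and `P`, the duration tables, and the emission probabilities are all hypothetical values chosen only to make the example run.

```python
import random

# Hypothetical two-state setup; none of these values come from the paper.
STATES = ("membrane", "non_membrane")

# Explicit duration distributions (segment length -> probability).
# Modelling segment length directly is what separates a semi-Markov
# model from an ordinary hidden Markov model.
DURATIONS = {
    "membrane": {4: 0.3, 5: 0.4, 6: 0.3},
    "non_membrane": {1: 0.5, 2: 0.3, 3: 0.2},
}

# Per-state emission probabilities over two toy monomer classes.
EMISSIONS = {
    "membrane": {"H": 0.8, "P": 0.2},
    "non_membrane": {"H": 0.3, "P": 0.7},
}


def draw(dist, rng):
    """Draw one key from a {value: probability} dictionary."""
    values = list(dist)
    weights = [dist[v] for v in values]
    return rng.choices(values, weights=weights, k=1)[0]


def sample_chain(length, rng):
    """Sample a monomer sequence and its hidden segment labels."""
    monomers, labels = [], []
    state = rng.choice(STATES)
    while len(monomers) < length:
        duration = draw(DURATIONS[state], rng)
        for _ in range(duration):
            if len(monomers) == length:
                break  # truncate the final segment at the chain end
            monomers.append(draw(EMISSIONS[state], rng))
            labels.append(state)
        # With two states and no self-transitions, the next state is forced.
        state = STATES[1] if state == STATES[0] else STATES[0]
    return "".join(monomers), labels


rng = random.Random(0)
sequence, labels = sample_chain(30, rng)
print(sequence)
print("".join("M" if s == "membrane" else "-" for s in labels))
```

Inference runs in the opposite direction: given only the monomer sequence, one estimates the segment labels and the model parameters. That is the step for which the abstract's stochastic variational methods are developed, and it is not attempted in this sketch.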
