Transfer Learning for Protein Structure Classification at Low Resolution

Paper Title

Transfer Learning for Protein Structure Classification at Low Resolution

Authors

Alexander Hudson, Shaogang Gong

Abstract

Structure determination is key to understanding protein function at a molecular level. Whilst significant advances have been made in predicting structure and function from amino acid sequence, researchers must still rely on expensive, time-consuming analytical methods to visualise detailed protein conformation. In this study, we demonstrate that it is possible to make accurate ($\geq$80%) predictions of protein class and architecture from structures determined at low ($>$3 Å) resolution, using a deep convolutional neural network trained on high-resolution ($\leq$3 Å) structures represented as 2D matrices. Thus, we provide proof of concept for high-speed, low-cost protein structure classification at low resolution, and a basis for extension to prediction of function. We investigate the impact of the input representation on classification performance, showing that side-chain information may not be necessary for fine-grained structure predictions. Finally, we confirm that high-resolution, low-resolution and NMR-determined structures inhabit a common feature space, and thus provide a theoretical foundation for boosting with single-image super-resolution.
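
The abstract says structures are represented as 2D matrices but does not say how. One common choice, assumed here purely for illustration, is a pairwise distance matrix over a chosen set of atoms. The sketch below uses only the Python standard library; the function name `distance_matrix` and the toy coordinates are hypothetical and are not taken from the paper.

```python
import math


def distance_matrix(coords):
    """Return the symmetric pairwise Euclidean distance matrix for 3D points."""
    n = len(coords)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = math.dist(coords[i], coords[j])
            # Distances are symmetric, so fill both halves at once.
            matrix[i][j] = d
            matrix[j][i] = d
    return matrix


# Hypothetical coordinates (in angstroms) for three atoms; not real data.
toy_coords = [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0), (0.0, 0.0, 5.0)]
toy_matrix = distance_matrix(toy_coords)
```

Under this assumption, restricting `coords` to backbone atoms only would be one way to build an input that omits side-chain information, which is the kind of representation the abstract reports may be sufficient for fine-grained predictions.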
