Paper Title

Advanced Graph and Sequence Neural Networks for Molecular Property Prediction and Drug Discovery

Authors

Zhengyang Wang, Meng Liu, Youzhi Luo, Zhao Xu, Yaochen Xie, Limei Wang, Lei Cai, Qi Qi, Zhuoning Yuan, Tianbao Yang, Shuiwang Ji

Abstract

Properties of molecules are indicative of their functions and are thus useful in many applications. With advances in deep learning, computational approaches for predicting molecular properties are gaining momentum. However, customized, advanced methods and comprehensive tools for this task are currently lacking. Here we develop a suite of machine learning methods and tools spanning different computational models, molecular representations, and loss functions for molecular property prediction and drug discovery. Specifically, we represent molecules as both graphs and sequences. Built on these representations, we develop novel deep models for learning from molecular graphs and sequences. To learn effectively from highly imbalanced datasets, we develop advanced loss functions that optimize the area under the precision-recall curve. Altogether, our work not only serves as a comprehensive tool, but also contributes to the development of novel and advanced graph and sequence learning methodologies. Results on both online and offline antibiotics discovery and molecular property prediction tasks show that our methods achieve consistent improvements over prior methods. In particular, our methods rank #1 in terms of both ROC-AUC and PRC-AUC on the AI Cures Open Challenge for drug discovery related to COVID-19. Our software is released as part of the MoleculeX library under AdvProp.
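The abstract reports results in terms of ROC-AUC and PRC-AUC on highly imbalanced data. As a rough illustration of why the two metrics can diverge in that setting, the following minimal sketch computes both on a toy example. It is not the authors' loss function or the MoleculeX API: the function names `average_precision` and `roc_auc`, the toy labels, and the scores are all hypothetical, average precision is used as one common estimate of the area under the precision-recall curve, and scores are assumed to contain no ties.

```python
def average_precision(labels, scores):
    """Estimate the area under the precision-recall curve as average precision.

    Assumes binary labels (0/1) and no tied scores.
    """
    n_pos = sum(labels)
    if n_pos == 0:
        raise ValueError("need at least one positive label")
    # Rank examples from highest to lowest score.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    hits = 0
    total = 0.0
    for rank, i in enumerate(order, start=1):
        if labels[i] == 1:
            hits += 1
            total += hits / rank  # precision at the rank of this positive
    return total / n_pos


def roc_auc(labels, scores):
    """Probability that a random positive outscores a random negative."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    # Count pairwise wins; a tie counts as half a win.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 2 active molecules among 10 (imbalanced).
labels = [1, 0, 0, 1, 0, 0, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.2, 0.15, 0.1, 0.05, 0.02]

print(roc_auc(labels, scores))            # 0.875
print(average_precision(labels, scores))  # 0.75
```

In this toy case, two negatives ranked above one positive cost little in ROC-AUC because there are many negatives, but they lower the precision-based score noticeably, which is one reason a precision-recall objective can be attractive when positives are rare.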
