Molecular Representation Learning via Heterogeneous Motif Graph Neural Networks

Paper title

Molecular Representation Learning via Heterogeneous Motif Graph Neural Networks

Authors

Zhaoning Yu, Hongyang Gao

Abstract

We consider the problem of feature representation learning for molecular graphs. Graph neural networks (GNNs) have been widely used for this task. However, most existing methods process each molecular graph individually and neglect the connections between graphs, such as motif-level relationships. To address this issue, we propose a novel molecular graph representation learning method based on a heterogeneous motif graph. In particular, we build a heterogeneous motif graph that contains motif nodes and molecular nodes, where each motif node corresponds to a motif extracted from the molecules. We then propose a Heterogeneous Motif Graph Neural Network (HM-GNN) to learn feature representations for each node in the heterogeneous motif graph. The heterogeneous motif graph also enables effective multi-task learning, especially for small molecular datasets. To address the potential efficiency issue, we propose an edge sampler, which significantly reduces computational resource usage. Experimental results show that our model consistently outperforms previous state-of-the-art models. Under multi-task settings, the promising performance of our method on combined datasets sheds light on a new learning paradigm for small molecular datasets. Finally, we show that, with the edge sampler, our model achieves similar performance using significantly fewer computational resources.
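The graph structure described in the abstract can be illustrated with a small sketch. This is not the authors' implementation. The abstract does not say how motifs are extracted, how edges are weighted, or how the edge sampler chooses edges, so the example assumes each molecule already comes with a set of motif labels, connects every molecule node to the motif nodes it contains, and uses uniform random sampling as a stand-in edge sampler. The names `build_heterogeneous_motif_graph`, `sample_edges`, and the toy motif labels are all hypothetical.

```python
import random


def build_heterogeneous_motif_graph(molecule_motifs):
    """Build a toy graph with two node types: molecules and motifs.

    molecule_motifs maps a molecule id to the motif labels found in it.
    Motif extraction itself is assumed to have happened elsewhere.
    """
    molecule_nodes = sorted(molecule_motifs)
    motif_nodes = sorted({m for motifs in molecule_motifs.values() for m in motifs})
    edges = set()
    for mol in molecule_nodes:
        for motif in set(molecule_motifs[mol]):
            # One undirected edge per (molecule, motif) pair.
            edges.add((("molecule", mol), ("motif", motif)))
    return molecule_nodes, motif_nodes, sorted(edges)


def sample_edges(edges, keep_ratio, seed=0):
    """Keep a uniformly random subset of edges (assumed sampling rule)."""
    rng = random.Random(seed)
    k = max(1, int(len(edges) * keep_ratio))
    return rng.sample(edges, k)


# Toy input: motif labels are illustrative placeholders, not real extraction output.
toy_molecule_motifs = {
    "mol_a": ["benzene_ring", "C-O"],
    "mol_b": ["benzene_ring", "C-N"],
    "mol_c": ["C-N"],
}

molecule_nodes, motif_nodes, edges = build_heterogeneous_motif_graph(toy_molecule_motifs)
sampled = sample_edges(edges, keep_ratio=0.5)
```

In this toy graph, `mol_a` and `mol_b` are linked through the shared `benzene_ring` motif node. That is the kind of cross-molecule connection the abstract says is lost when each graph is processed on its own.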

Paper title