Paper Title

Understanding Non-linearity in Graph Neural Networks from the Bayesian-Inference Perspective

Authors

Rongzhe Wei, Haoteng Yin, Junteng Jia, Austin R. Benson, Pan Li

Abstract

Graph neural networks (GNNs) have shown superiority in many prediction tasks over graphs due to their impressive capability of capturing nonlinear relations in graph-structured data. However, for node classification tasks, often only marginal improvement of GNNs over their linear counterparts has been observed. Previous works provide little understanding of this phenomenon. In this work, we resort to Bayesian learning to deeply investigate the functions of non-linearity in GNNs for node classification tasks. Given a graph generated from the contextual stochastic block model (CSBM), we observe that the maximum-a-posteriori (MAP) estimation of a node label given its own and its neighbors' attributes consists of two types of non-linearity: a possibly non-linear transformation of node attributes and a ReLU-activated feature aggregation from neighbors. The latter surprisingly matches the type of non-linearity used in many GNN models. By further imposing a Gaussian assumption on node attributes, we prove that the superiority of those ReLU activations is significant only when the node attributes are far more informative than the graph structure, which nicely matches many previous empirical observations. A similar argument holds when there is a distribution shift of node attributes between the training and testing datasets. Finally, we verify our theory on both synthetic and real-world networks.
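
As a rough sketch of the decomposition the abstract refers to, consider a two-class CSBM with labels $y_v \in \{\pm 1\}$, intra- and inter-class edge probabilities $p$ and $q$, and attribute likelihoods $P(X_v \mid y_v)$ (this notation is assumed here for illustration and may differ from the paper's). Treating neighbor attributes as conditionally independent given $y_v$, the posterior log-odds of a node's label splits into its own evidence plus one non-linear term per neighbor:

$$
\log\frac{P(y_v=+1\mid X_v, X_{N(v)})}{P(y_v=-1\mid X_v, X_{N(v)})}
\;=\; h(X_v) \;+\; \sum_{u\in N(v)} \psi\bigl(h(X_u)\bigr),
\qquad
h(x)=\log\frac{P(x\mid y=+1)}{P(x\mid y=-1)},\quad
\psi(t)=\log\frac{p\,e^{t}+q}{q\,e^{t}+p}.
$$

Under the Gaussian assumption $X_v \sim \mathcal{N}(y_v\mu,\sigma^2 I)$, the self term $h(x)=2\mu^{\top}x/\sigma^{2}$ is linear, while each neighbor term $\psi$ saturates at $\pm\log(p/q)$; approximating its log-sum-exp terms by a max (assuming $p>q$) gives $\psi(t)\approx \mathrm{ReLU}\!\bigl(t+\log\tfrac{p}{q}\bigr)-\mathrm{ReLU}\!\bigl(t-\log\tfrac{p}{q}\bigr)-\log\tfrac{p}{q}$, which is one way to see how a ReLU-activated aggregation of the kind mentioned in the abstract can arise.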
