Paper title
SCARA: Scalable Graph Neural Networks with Feature-Oriented Optimization
Paper authors
Abstract
Recent advances in data processing have stimulated the demand for learning on graphs of very large scale. Graph Neural Networks (GNNs), an emerging and powerful approach to graph learning tasks, are known to be difficult to scale up. Most scalable models apply node-based techniques to simplify the expensive message-passing propagation procedure of GNNs. However, we find such acceleration insufficient when applied to million- or even billion-scale graphs. In this work, we propose SCARA, a scalable GNN with feature-oriented optimization for graph computation. SCARA efficiently computes graph embeddings from node features, and further selects and reuses feature computation results to reduce overhead. Theoretical analysis indicates that our model achieves sub-linear time complexity with guaranteed precision in the propagation process as well as in GNN training and inference. We conduct extensive experiments on various datasets to evaluate the efficacy and efficiency of SCARA. Performance comparison with baselines shows that SCARA achieves up to 100x faster graph propagation than current state-of-the-art methods, with fast convergence and comparable accuracy. Most notably, it completes precomputation on the largest available billion-scale GNN dataset, Papers100M (111M nodes, 1.6B edges), in 100 seconds.
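
The idea of reusing feature computation results can be illustrated with a small sketch. The abstract does not specify the propagation operator or the reuse rule, so everything below is an assumption made for illustration: the function names `propagate` and `propagate_with_reuse`, the truncated personalized-PageRank-style weights, and the single-base-vector projection are hypothetical and are not taken from the paper. The sketch relies only on the fact that such a propagation is linear in the feature vector, so a feature column can be split into a multiple of an already-propagated base column plus a residual.

```python
def propagate(neighbors, x, alpha=0.15, num_hops=10):
    """Return sum_{k=0..num_hops} alpha * (1 - alpha)**k * P^k x.

    P averages a vector over each node's neighbors. Assumes every node
    has at least one neighbor (illustrative operator, not the paper's).
    """
    result = [0.0] * len(x)
    term = [float(v) for v in x]
    for k in range(num_hops + 1):
        weight = alpha * (1 - alpha) ** k
        result = [r + weight * t for r, t in zip(result, term)]
        # One hop: each node takes the mean of its neighbors' values.
        term = [sum(term[u] for u in nbrs) / len(nbrs) for nbrs in neighbors]
    return result


def propagate_with_reuse(neighbors, x, base, base_out, **kwargs):
    """Propagate x while reusing base_out = propagate(neighbors, base)."""
    # Project x onto the base vector: x = c * base + residual.
    c = sum(a * b for a, b in zip(x, base)) / sum(b * b for b in base)
    residual = [a - c * b for a, b in zip(x, base)]
    # Linearity: propagate(x) = c * propagate(base) + propagate(residual).
    residual_out = propagate(neighbors, residual, **kwargs)
    return [c * bo + ro for bo, ro in zip(base_out, residual_out)]


# Usage on a 4-node cycle graph given as adjacency lists.
neighbors = [[1, 3], [0, 2], [1, 3], [0, 2]]
base = [1.0, 2.0, 0.0, 1.0]   # feature column propagated once and cached
x = [2.1, 3.9, 0.2, 2.0]      # a similar feature column
base_out = propagate(neighbors, base)
direct = propagate(neighbors, x)
reused = propagate_with_reuse(neighbors, x, base, base_out)
```

Here the residual is propagated exactly, so `direct` and `reused` agree up to floating-point error and nothing is saved. In an approximate scheme, a small residual would presumably need less work to reach a given precision, which is where reuse could reduce overhead.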