Paper Title
Efficient Traffic State Forecasting using Spatio-Temporal Network Dependencies: A Sparse Graph Neural Network Approach
Paper Authors
Paper Abstract
Traffic state prediction in a transportation network is paramount for effective traffic operations and management, as well as for informed user and system-level decision-making. However, long-term traffic prediction (beyond 30 minutes into the future) remains challenging in current research. In this work, we integrate the spatio-temporal dependencies of the transportation network, derived from network modeling, with the graph convolutional network (GCN) and the graph attention network (GAT). To further tackle the dramatic computation and memory costs caused by the giant model size (i.e., number of weights) resulting from multiple cascaded layers, we propose sparse training to mitigate the training cost while preserving the prediction accuracy: each layer is trained with a fixed number of nonzero weights in every iteration. We consider the problem of long-term traffic speed forecasting on real large-scale transportation network data from the California Department of Transportation (Caltrans) Performance Measurement System (PeMS). Experimental results show that the proposed GCN-STGT and GAT-STGT models achieve low prediction errors on short-, mid-, and long-term prediction horizons of 15, 30, and 45 minutes, respectively. With our sparse training, we can train from scratch at high sparsity (e.g., up to 90%), equivalent to a 10x reduction in floating point operations (FLOPs) over the same number of epochs as dense training, and arrive at a model with very small accuracy loss compared with the original dense training.
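To make the fixed-nonzero-weight idea concrete, below is a minimal PyTorch sketch of one common realization of this kind of sparse training, a SET-style prune-and-regrow scheme applied to a single linear layer. This is an illustration under stated assumptions, not the paper's published implementation (which applies the idea to GCN/GAT layers); the class name `SparseLinear`, the `swap_frac` parameter, and the random-regrowth rule are all hypothetical choices for the sketch.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SparseLinear(nn.Module):
    """Linear layer trained with a fixed number of nonzero weights (sketch).

    A binary mask with a fixed nonzero budget is applied to the weight in
    every forward pass; periodically, the smallest-magnitude active weights
    are pruned and the same number of inactive positions are regrown, so the
    nonzero count stays constant throughout training.
    """

    def __init__(self, in_features: int, out_features: int, sparsity: float = 0.9):
        super().__init__()
        self.weight = nn.Parameter(torch.empty(out_features, in_features))
        nn.init.kaiming_uniform_(self.weight)
        self.bias = nn.Parameter(torch.zeros(out_features))
        n = self.weight.numel()
        self.budget = n - int(n * sparsity)  # fixed nonzero budget per layer
        # Random initial mask with exactly `budget` nonzero positions.
        mask = torch.zeros(n)
        mask[torch.randperm(n)[: self.budget]] = 1.0
        self.register_buffer("mask", mask.view_as(self.weight))
        with torch.no_grad():
            self.weight.mul_(self.mask)  # start from a sparse weight tensor

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Masked positions receive zero gradient, so the sparsity pattern is
        # preserved between prune/regrow steps.
        return F.linear(x, self.weight * self.mask, self.bias)

    @torch.no_grad()
    def prune_and_regrow(self, swap_frac: float = 0.3) -> None:
        """Drop the smallest-magnitude active weights; regrow the same number
        at random inactive positions, keeping the nonzero budget fixed."""
        n_swap = int(self.budget * swap_frac)
        flat_w, flat_m = self.weight.view(-1), self.mask.view(-1)
        active = flat_m.nonzero(as_tuple=True)[0]
        inactive = (flat_m == 0).nonzero(as_tuple=True)[0]
        # Prune: smallest-magnitude active connections.
        drop = active[flat_w[active].abs().topk(n_swap, largest=False).indices]
        # Regrow: uniformly random inactive positions.
        grow = inactive[torch.randperm(inactive.numel())[:n_swap]]
        flat_m[drop] = 0.0
        flat_m[grow] = 1.0
        flat_w[drop] = 0.0
        flat_w[grow] = 0.0  # regrown connections restart from zero (SET-style)
```

In such a scheme, `prune_and_regrow` would typically be called once per epoch or every few hundred iterations; because the number of nonzero weights never exceeds the budget, the per-step FLOP count is reduced roughly in proportion to the sparsity when sparse kernels are used, which is the source of the abstract's claimed 10x FLOP reduction at 90% sparsity.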