Paper Title
HDGT: Heterogeneous Driving Graph Transformer for Multi-Agent Trajectory Prediction via Scene Encoding
Authors
Abstract
Encoding a driving scene into vector representations is an essential task in autonomous driving, and it benefits downstream tasks such as trajectory prediction. A driving scene often involves heterogeneous elements, such as different types of objects (agents, lanes, traffic signs), and the semantic relations between objects are rich and diverse. There is also relativity across elements: a spatial relation is a relative concept and needs to be encoded in an ego-centric manner rather than in a global coordinate system. Based on these observations, we propose the Heterogeneous Driving Graph Transformer (HDGT), a backbone that models the driving scene as a heterogeneous graph with different types of nodes and edges. For heterogeneous graph construction, we connect different types of nodes according to diverse semantic relations. For spatial relation encoding, the coordinates of a node and of its in-edges are expressed in the local node-centric coordinate system. For the aggregation module in the graph neural network (GNN), we adopt the transformer structure in a hierarchical way to fit the heterogeneous nature of the inputs. Experimental results show that HDGT achieves state-of-the-art performance on trajectory prediction in the INTERACTION Prediction Challenge and the Waymo Open Motion Challenge.
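To make the idea of a node-centric coordinate system concrete, the following is a minimal sketch of re-expressing global 2D points in a frame centred on one node. It is an illustration only and is not taken from the HDGT implementation: the function name `to_local_frame`, its parameters, and the convention that the node's heading becomes the local +x axis are all assumptions made here.

```python
import math


def to_local_frame(points, origin, heading):
    """Express global (x, y) points in a frame centred at `origin`.

    Hypothetical helper for illustration. `heading` is the node's
    orientation in radians, measured in the global frame; after the
    transform it points along the local +x axis.
    """
    cos_h, sin_h = math.cos(heading), math.sin(heading)
    local = []
    for x, y in points:
        # Translate so the node sits at the origin.
        dx, dy = x - origin[0], y - origin[1]
        # Rotate by -heading to align the node's heading with local +x.
        local.append((cos_h * dx + sin_h * dy, -sin_h * dx + cos_h * dy))
    return local


# An agent at (10, 5) facing the global +y direction.
neighbours = [(10.0, 7.0), (9.0, 5.0)]
print(to_local_frame(neighbours, (10.0, 5.0), math.pi / 2))
```

Under these assumptions, the point 2 m ahead of the agent maps to roughly (2, 0) and the point 1 m to its left maps to roughly (0, 1), regardless of where the agent sits in the global map.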