Paper Title
A Follow-the-Leader Strategy using Hierarchical Deep Neural Networks with Grouped Convolutions
Paper Authors
Paper Abstract
The follow-the-leader task is implemented using a hierarchical Deep Neural Network (DNN) end-to-end driving model that matches the direction and speed of a target pedestrian. The model uses a classifier DNN to determine whether the pedestrian is within the field of view of the camera sensor. If the pedestrian is present, the image stream from the camera is fed to a regression DNN, which simultaneously adjusts the autonomous vehicle's steering and throttle to keep cadence with the pedestrian. If the pedestrian is not visible, the vehicle uses a straightforward exploratory search strategy to reacquire the tracking objective. The classifier and regression DNNs incorporate grouped convolutions to boost model performance and to significantly reduce parameter count and compute latency. The models are trained on the Intelligence Processing Unit (IPU) to leverage its fine-grained compute capabilities and minimize time-to-train. The results indicate very robust tracking behavior by the autonomous vehicle in terms of its steering and throttle profiles, while requiring minimal data collection. Using the IPU in conjunction with grouped convolutions boosts training-sample throughput by a factor of ~3.5 for the classifier and ~7 for the regression network. A recording of the vehicle tracking a pedestrian has been produced and is available on the web. This is a preprint of an article published in SN Computer Science. The final authenticated version is available online at: https://doi.org/10.1007/s42979-021-00572-1.
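The two ideas in the abstract, the classifier-then-regressor hierarchy and the parameter savings from grouped convolutions, can be illustrated with a minimal Python sketch. This is not the authors' implementation. The names `Command`, `follow_step`, and `grouped_conv_params`, the 0.5 detection threshold, the fixed search command, and the value ranges are all hypothetical choices made for illustration; the abstract does not specify them. The parameter count uses the standard formula for a grouped 2D convolution, in which each output channel sees only `c_in / groups` input channels.

```python
from dataclasses import dataclass
from typing import Any, Callable, Tuple


@dataclass(frozen=True)
class Command:
    steering: float  # assumed range [-1, 1]; not specified in the abstract
    throttle: float  # assumed range [0, 1]; not specified in the abstract
    mode: str        # "track" or "search"


# Hypothetical placeholder for the exploratory search behavior:
# turn slowly in place until the pedestrian is seen again.
DEFAULT_SEARCH = Command(steering=0.5, throttle=0.2, mode="search")


def follow_step(
    frame: Any,
    classifier: Callable[[Any], float],
    regressor: Callable[[Any], Tuple[float, float]],
    threshold: float = 0.5,  # assumed decision threshold
    search_command: Command = DEFAULT_SEARCH,
) -> Command:
    """One control step of the hierarchy: classify first, then regress or search."""
    pedestrian_probability = classifier(frame)
    if pedestrian_probability >= threshold:
        # Pedestrian visible: the regression model sets steering and throttle together.
        steering, throttle = regressor(frame)
        return Command(steering, throttle, "track")
    # Pedestrian not visible: fall back to the search behavior.
    return search_command


def grouped_conv_params(
    c_in: int, c_out: int, kernel: int, groups: int = 1, bias: bool = True
) -> int:
    """Parameter count of a 2D convolution with a square kernel and `groups` groups."""
    if c_in % groups != 0 or c_out % groups != 0:
        raise ValueError("c_in and c_out must both be divisible by groups")
    weights = (c_in // groups) * c_out * kernel * kernel
    return weights + (c_out if bias else 0)


if __name__ == "__main__":
    # Stub models standing in for the trained DNNs.
    seen = follow_step("frame", lambda f: 0.9, lambda f: (0.1, 0.3))
    lost = follow_step("frame", lambda f: 0.1, lambda f: (0.1, 0.3))
    print(seen.mode, lost.mode)

    dense = grouped_conv_params(64, 64, 3, groups=1, bias=False)
    grouped = grouped_conv_params(64, 64, 3, groups=4, bias=False)
    print(dense, grouped, dense // grouped)
```

For a fixed layer shape, the weight count shrinks by exactly the number of groups, which is the mechanism behind the reduced parameter count mentioned in the abstract. The throughput factors reported there depend on the hardware and the full architecture and are not reproduced by this sketch.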