On the Choice of Data for Efficient Training and Validation of End-to-End Driving Models

Paper Title

On the Choice of Data for Efficient Training and Validation of End-to-End Driving Models

Authors

Marvin Klingner, Konstantin Müller, Mona Mirzaie, Jasmin Breitenstein, Jan-Aike Termöhlen, Tim Fingscheidt

Abstract

The emergence of data-driven machine learning (ML) has facilitated significant progress in many complicated tasks such as highly automated driving. While much effort is put into improving the ML models and learning algorithms in such applications, little attention is paid to how the training data and/or validation setting should be designed. In this paper, we investigate the influence of several data design choices on the training and validation of deep driving models that are trainable in an end-to-end fashion. Specifically, (i) we investigate how the amount of training data influences the final driving performance, and which performance limitations are induced by currently used mechanisms for generating training data. (ii) Further, we show by correlation analysis which validation design enables the driving performance measured during validation to generalize well to unknown test environments. (iii) Finally, we investigate the effect of random seeding and non-determinism, giving insights into which reported improvements can be deemed significant. Our evaluations using the popular CARLA simulator provide recommendations regarding data generation and driving route selection for an efficient future development of end-to-end driving models.
