SafeBench: A Benchmarking Platform for Safety Evaluation of Autonomous Vehicles

Paper Title

SafeBench: A Benchmarking Platform for Safety Evaluation of Autonomous Vehicles

Authors

Chejian Xu, Wenhao Ding, Weijie Lyu, Zuxin Liu, Shuai Wang, Yihan He, Hanjiang Hu, Ding Zhao, Bo Li

Abstract

As shown by recent studies, machine intelligence-enabled systems are vulnerable to test cases resulting from either adversarial manipulation or natural distribution shifts. This has raised great concerns about deploying machine learning algorithms for real-world applications, especially in safety-critical domains such as autonomous driving (AD). On the other hand, traditional AD testing on naturalistic scenarios requires hundreds of millions of driving miles due to the high dimensionality and rareness of the safety-critical scenarios in the real world. As a result, several approaches for autonomous driving evaluation have been explored, which are usually, however, based on different simulation platforms, types of safety-critical scenarios, scenario generation algorithms, and driving route variations. Thus, despite a large amount of effort in autonomous driving testing, it is still challenging to compare and understand the effectiveness and efficiency of different testing scenario generation algorithms and testing mechanisms under similar conditions. In this paper, we aim to provide the first unified platform SafeBench to integrate different types of safety-critical testing scenarios, scenario generation algorithms, and other variations such as driving routes and environments. Meanwhile, we implement 4 deep reinforcement learning-based AD algorithms with 4 types of input (e.g., bird's-eye view, camera) to perform fair comparisons on SafeBench. We find our generated testing scenarios are indeed more challenging and observe the trade-off between the performance of AD agents under benign and safety-critical testing scenarios. We believe our unified platform SafeBench for large-scale and effective autonomous driving testing will motivate the development of new testing scenario generation and safe AD algorithms. SafeBench is available at https://safebench.github.io.
