Paper Title
Cityscapes 3D: Dataset and Benchmark for 9 DoF Vehicle Detection
Paper Authors
Paper Abstract
Detecting vehicles and representing their position and orientation in three-dimensional space is a key technology for autonomous driving. Recently, methods for 3D vehicle detection based solely on monocular RGB images have gained popularity. To facilitate this task and to compare and drive state-of-the-art methods, several new datasets and benchmarks have been published. Ground truth annotations of vehicles are usually obtained using lidar point clouds, which often induces errors due to imperfect calibration or synchronization between the two sensors. To this end, we propose Cityscapes 3D, extending the original Cityscapes dataset with 3D bounding box annotations for all types of vehicles. In contrast to existing datasets, our 3D annotations were labeled using stereo RGB images only and capture all nine degrees of freedom. This leads to a pixel-accurate reprojection in the RGB image and a higher range of annotations compared to lidar-based approaches. To ease multitask learning, we provide a pairing of 2D instance segments with 3D bounding boxes. In addition, we complement the Cityscapes benchmark suite with 3D vehicle detection based on the new annotations as well as metrics presented in this work. The dataset and benchmark are available online.
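To make the notion of a nine-degree-of-freedom box concrete, the following is a minimal Python sketch, not the dataset's own format or tooling. It assumes a box is given by a 3D center, three dimensions, and three Euler angles (yaw, pitch, roll, composed as Rz·Ry·Rx), and that points are projected with an ideal pinhole camera in a frame where z points forward. The function names, the angle convention, and the intrinsics `fx`, `fy`, `cx`, `cy` are all illustrative assumptions.

```python
import math
from itertools import product


def rotation_matrix(yaw, pitch, roll):
    """Return R = Rz(yaw) * Ry(pitch) * Rx(roll); angles in radians.

    The Euler convention is an assumption made for this sketch only.
    """
    cy, sy = math.cos(yaw), math.sin(yaw)
    cp, sp = math.cos(pitch), math.sin(pitch)
    cr, sr = math.cos(roll), math.sin(roll)
    return [
        [cy * cp, cy * sp * sr - sy * cr, cy * sp * cr + sy * sr],
        [sy * cp, sy * sp * sr + cy * cr, sy * sp * cr - cy * sr],
        [-sp, cp * sr, cp * cr],
    ]


def box_corners(center, size, angles):
    """Return the 8 corners of a 9 DoF box.

    center: (x, y, z) position       -> 3 degrees of freedom
    size:   extents along local axes -> 3 degrees of freedom
    angles: (yaw, pitch, roll)       -> 3 degrees of freedom
    """
    rot = rotation_matrix(*angles)
    half = [s / 2.0 for s in size]
    corners = []
    for signs in product((-1.0, 1.0), repeat=3):
        # Corner in the box's local frame, then rotate and translate.
        local = [sign * h for sign, h in zip(signs, half)]
        corners.append(tuple(
            center[i] + sum(rot[i][j] * local[j] for j in range(3))
            for i in range(3)
        ))
    return corners


def project(point, fx, fy, cx, cy):
    """Project a 3D point (z forward) with an ideal pinhole camera."""
    x, y, z = point
    if z <= 0:
        raise ValueError("point must lie in front of the camera")
    return (fx * x / z + cx, fy * y / z + cy)


if __name__ == "__main__":
    # Hypothetical box 10 m in front of the camera, rotated slightly.
    corners = box_corners((0.0, 0.0, 10.0), (4.0, 2.0, 1.5), (0.1, 0.0, 0.0))
    for corner in corners:
        print(project(corner, fx=1000.0, fy=1000.0, cx=500.0, cy=300.0))
```

Drawing the projected corners over the image is one simple way to check how closely a 3D annotation reprojects onto the vehicle it labels.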