INT: Towards Infinite-frames 3D Detection with An Efficient Framework

Paper title

INT: Towards Infinite-frames 3D Detection with An Efficient Framework

Authors

Xu, Jianyun; Miao, Zhenwei; Zhang, Da; Pan, Hongyu; Liu, Kaixuan; Hao, Peihan; Zhu, Jun; Sun, Zhengyang; Li, Hongmin; Zhan, Xin

Abstract

It is natural to build a multi-frame rather than a single-frame 3D detector for a continuous-time stream. Although increasing the number of frames might improve performance, previous multi-frame studies used only a very limited number of frames because of the sharply increased computational and memory cost. To address these issues, we propose a novel on-stream training and prediction framework that, in theory, can employ an infinite number of frames while keeping the same amount of computation as a single-frame detector. This infinite framework (INT) can be used with most existing detectors; applied to the popular CenterPoint, for example, it yields significant latency reductions and performance improvements. We also conduct extensive experiments on two large-scale datasets, nuScenes and Waymo Open Dataset, to demonstrate the scheme's effectiveness and efficiency. By employing INT on CenterPoint, we obtain performance boosts of around 7% (Waymo) and 15% (nuScenes) with only 2~4 ms of latency overhead, and INT is currently SOTA on the Waymo 3D Detection leaderboard.
