Paper title
Implicit LiDAR Network: LiDAR Super-Resolution via Interpolation Weight Prediction
Authors
Abstract
Super-resolution of LiDAR range images is crucial to improving many downstream tasks such as object detection, recognition, and tracking. While deep learning has made remarkable advances in super-resolution techniques, typical convolutional architectures limit upscaling factors to the specific output resolutions used in training. Recent work has shown that representing an image continuously and learning its implicit function enables almost limitless upscaling. However, the specific approach used there, which predicts values (depths) at neighboring input pixels and then linearly interpolates them, is not well suited to LiDAR range images: rather than filling in unmeasured details, it creates a new image by regression in a high-dimensional space. In addition, linear interpolation blurs sharp edges, which carry important boundary information about objects in the 3-D points. To handle these problems, we propose a novel network, the Implicit LiDAR Network (ILN), which learns not per-pixel values but the interpolation weights, so that super-resolution is performed by blending the input pixel depths with non-linear weights. The weights can also be regarded as attention from the query to the neighboring pixels, so an attention module from the recent Transformer architecture can be leveraged. Our experiments with a novel large-scale synthetic dataset demonstrate that the proposed network reconstructs more accurately than state-of-the-art methods while converging much faster in training.
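The difference between linear interpolation and blending with non-linear weights can be illustrated with a small numerical sketch. This is not the ILN implementation: the function `attention_weights` below is a hypothetical stand-in that derives weights from a softmax over negative squared distances, whereas the abstract describes weights predicted by a learned attention module. The neighbor depths, the query offset, and the `temperature` parameter are likewise assumptions chosen only for illustration.

```python
import numpy as np

# Neighbor order used throughout: top-left, top-right, bottom-left, bottom-right.
NEIGHBOR_OFFSETS = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])


def bilinear_weights(dx, dy):
    """Standard bilinear weights for a query at fractional offset (dx, dy) in [0, 1]^2."""
    return np.array([
        (1 - dx) * (1 - dy),
        dx * (1 - dy),
        (1 - dx) * dy,
        dx * dy,
    ])


def softmax(scores):
    """Numerically stable softmax over a 1-D array."""
    e = np.exp(scores - np.max(scores))
    return e / e.sum()


def attention_weights(dx, dy, temperature=0.05):
    """Hypothetical stand-in for a learned weight predictor (assumption, not ILN).

    Scores are negative squared distances from the query to each neighbor;
    a small temperature makes the weights sharply non-linear.
    """
    d2 = ((NEIGHBOR_OFFSETS - np.array([dx, dy])) ** 2).sum(axis=1)
    return softmax(-d2 / temperature)


def blend(depths, weights):
    """Blend measured neighbor depths with the given weights."""
    return float(np.dot(weights, depths))


# A depth discontinuity: the left column is a near object, the right column is far.
neighbor_depths = np.array([5.0, 20.0, 5.0, 20.0])  # metres, illustrative values
dx, dy = 0.4, 0.5  # query lies slightly closer to the left (near) column

w_linear = bilinear_weights(dx, dy)
w_attn = attention_weights(dx, dy)

linear_depth = blend(neighbor_depths, w_linear)  # lands between the two surfaces
attn_depth = blend(neighbor_depths, w_attn)      # stays close to the near surface

print(f"linear: {linear_depth:.2f} m, non-linear weights: {attn_depth:.2f} m")
```

In this toy case the linearly interpolated depth falls roughly midway between the two surfaces, a point that belongs to neither object, while the sharper weights keep the query close to the near surface. Because both weight vectors are non-negative and sum to one, the blended depth always stays within the range of the measured neighbor depths, which reflects the abstract's point that the output is built from input depths rather than regressed as new values.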