Paper Title

Efficient stereo matching on embedded GPUs with zero-means cross correlation

Authors

Chang, Qiong; Zha, Aolong; Wang, Weimin; Liu, Xin; Onishi, Masaki; Lei, Lei; Er, Meng Joo; Maruyama, Tsutomu

Abstract

Mobile stereo-matching systems have become an important part of many applications, such as automated-driving vehicles and autonomous robots. Accurate stereo-matching methods usually have high computational complexity, whereas mobile platforms have only limited hardware resources in order to keep their power consumption low. This makes it difficult to maintain both an acceptable processing speed and accuracy on mobile platforms. To resolve this trade-off, we herein propose a novel acceleration approach for the well-known zero-means normalized cross correlation (ZNCC) matching cost calculation algorithm on a Jetson TX2 embedded GPU. In our method for accelerating ZNCC, target images are scanned in a zigzag fashion so that the computation for one pixel is efficiently reused for its neighboring pixels. This reduces the amount of data transmission and increases the utilization of on-chip registers, thus increasing the processing speed. As a result, our method is 2x faster than the traditional image scanning method and 26% faster than the latest NCC method. By combining this technique with the domain transformation (DT) algorithm, our system achieves a real-time processing speed of 32 fps on a Jetson TX2 GPU for 1,280x384 pixel images with a maximum disparity of 128. Additionally, the evaluation results on the KITTI 2015 benchmark show that our combined system is 7.26% more accurate than the same algorithm combined with census, while maintaining almost the same processing speed.
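
For readers unfamiliar with the matching cost named in the abstract, the following is a minimal CPU sketch of the standard ZNCC definition and a brute-force disparity search along one image row. It is not the paper's GPU implementation and does not include the zigzag scanning or the DT step; the function names (`zncc`, `patch`, `best_disparity`), the square window of radius `r`, and the list-of-lists image layout are assumptions made purely for illustration.

```python
import math

def zncc(a, b, eps=1e-9):
    """ZNCC of two equal-length patches given as flat lists of intensities.

    Standard definition: sum((a - mean_a) * (b - mean_b)) divided by
    sqrt(sum((a - mean_a)^2) * sum((b - mean_b)^2)). Result lies in [-1, 1].
    """
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    da = [x - mean_a for x in a]
    db = [y - mean_b for y in b]
    num = sum(x * y for x, y in zip(da, db))
    den = math.sqrt(sum(x * x for x in da) * sum(y * y for y in db))
    # A flat (textureless) patch has zero variance; return 0 by convention here.
    return num / den if den > eps else 0.0

def patch(img, row, col, r):
    """Flatten the (2r+1) x (2r+1) window centred at (row, col)."""
    return [img[y][x]
            for y in range(row - r, row + r + 1)
            for x in range(col - r, col + r + 1)]

def best_disparity(left, right, row, col, r, max_disp):
    """Winner-take-all search: the disparity d whose right-image window,
    centred at column col - d, has the highest ZNCC with the left window."""
    ref = patch(left, row, col, r)
    best_d, best_score = 0, -2.0  # -2.0 is below any possible ZNCC value
    for d in range(max_disp + 1):
        if col - d - r < 0:  # window would leave the image
            break
        score = zncc(ref, patch(right, row, col - d, r))
        if score > best_score:
            best_d, best_score = d, score
    return best_d
```

Because each patch is mean-subtracted and normalized, the score is unchanged by a positive gain or offset in brightness between the two views. The sketch recomputes every window from scratch, which is the kind of redundant work across neighboring pixels that the abstract says the zigzag scan is designed to reuse.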