Paper Title
Spatio-Temporal Graph Localization Networks for Image-based Navigation
Authors
Abstract
Localization in topological maps is essential for image-based navigation using an RGB camera. Localizing with a single camera can be challenging in medium-to-large environments because similar-looking images are often observed repeatedly, especially indoors. To overcome this issue, we propose a learning-based localization method that simultaneously exploits the spatial consistency of the topological map and the temporal consistency of the time-series images captured by the robot. Our method combines a convolutional neural network (CNN), which embeds image features, with a recurrent-type graph neural network, which performs the localization. Training such a model is hampered by the difficulty of obtaining the robot's ground-truth pose for images captured in real-world environments. Hence, we propose a sim2real transfer approach with semi-supervised learning that leverages simulator images with ground-truth poses in addition to real images. We evaluated our method quantitatively and qualitatively and compared it with several state-of-the-art baselines. The proposed method outperformed the baselines in environments where the map contained similar images. Moreover, we evaluated an image-based navigation system incorporating our localization method and confirmed that navigation accuracy improved significantly in both simulator and real environments compared with the baseline methods.
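
To make the idea of combining spatial and temporal consistency concrete, the following is a minimal sketch, not the network described in the abstract. It replaces the learned CNN embeddings and the recurrent graph neural network with hand-written feature vectors and a discrete Bayes-filter-style belief update over the map graph. The function name `localize`, the `temperature` parameter, and the toy four-node map are all assumptions made for illustration.

```python
import numpy as np


def softmax(x):
    """Numerically stable softmax over a 1-D array."""
    e = np.exp(x - x.max())
    return e / e.sum()


def localize(node_feats, adjacency, observations, temperature=0.1):
    """Return the most likely map node for each observation in a sequence.

    Hypothetical stand-in for a learned model: node_feats and observations
    play the role of image embeddings, adjacency encodes the topological map.
    """
    n = node_feats.shape[0]
    # Spatial consistency: the robot either stays put or moves to a neighbour.
    trans = adjacency + np.eye(n)
    trans = trans / trans.sum(axis=1, keepdims=True)
    belief = np.full(n, 1.0 / n)  # uniform prior over nodes
    path = []
    for obs in observations:
        # Temporal consistency: carry the previous belief along map edges.
        predicted = trans.T @ belief
        # Appearance: cosine similarity between observation and node features.
        sims = node_feats @ obs / (
            np.linalg.norm(node_feats, axis=1) * np.linalg.norm(obs)
        )
        belief = predicted * softmax(sims / temperature)
        belief /= belief.sum()
        path.append(int(belief.argmax()))
    return path


# Toy chain map 0-1-2-3 in which nodes 0 and 3 look identical.
node_feats = np.array([[1.0, 0.0, 0.0],
                       [0.0, 1.0, 0.0],
                       [0.0, 0.0, 1.0],
                       [1.0, 0.0, 0.0]])
adjacency = np.array([[0, 1, 0, 0],
                      [1, 0, 1, 0],
                      [0, 1, 0, 1],
                      [0, 0, 1, 0]], dtype=float)
# The robot drives 1 -> 2 -> 3; the last view matches nodes 0 and 3 equally.
observations = [node_feats[1], node_feats[2], node_feats[3]]

path = localize(node_feats, adjacency, observations)
print(path)  # [1, 2, 3]
```

In this toy case, appearance alone cannot separate node 0 from node 3 for the final image, whereas propagating the belief from node 2 along the map edges resolves the ambiguity in favour of node 3. A learned model would presumably replace both the fixed transition matrix and the similarity score with trained components.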