Paper Title
IV-SLAM: Introspective Vision for Simultaneous Localization and Mapping
Paper Authors
Paper Abstract
Existing solutions to visual simultaneous localization and mapping (V-SLAM) assume that errors in feature extraction and matching are independent and identically distributed (i.i.d.), but this assumption is known not to be true -- features extracted from low-contrast regions of images exhibit wider error distributions than features from sharp corners. Furthermore, V-SLAM algorithms are prone to catastrophic tracking failures when sensed images include challenging conditions such as specular reflections, lens flare, or shadows of dynamic objects. To address such failures, previous work has focused on building more robust visual frontends to filter out challenging features. In this paper, we present introspective vision for SLAM (IV-SLAM), a fundamentally different approach for addressing these challenges. IV-SLAM explicitly models the noise process of reprojection errors from visual features to be context-dependent, and hence non-i.i.d. We introduce an autonomously supervised approach for IV-SLAM to collect training data to learn such a context-aware noise model. Using this learned noise model, IV-SLAM guides feature extraction to select more features from parts of the image that are likely to result in lower noise, and further incorporates the learned noise model into the joint maximum likelihood estimation, thus making it robust to the aforementioned types of errors. We present empirical results to demonstrate that IV-SLAM 1) is able to accurately predict sources of error in input images, 2) reduces tracking error compared to V-SLAM, and 3) increases the mean distance between tracking failures by more than 70% on challenging real robot data compared to V-SLAM.
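To make the non-i.i.d. noise model concrete, the following is a minimal sketch of how a context-dependent covariance could enter the joint maximum likelihood estimation described in the abstract; the notation (observations z_{ik}, landmark positions x_k, camera poses T_i, projection function \pi, and per-observation covariance \Sigma_{ik}) is illustrative and not taken from the paper. Standard V-SLAM minimizes reprojection errors under a single fixed covariance (the i.i.d. assumption), whereas a context-aware model predicts a covariance per feature from its image context:

\[
\{\hat{T}_i, \hat{x}_k\} \;=\; \arg\min_{T,\,x} \sum_{i,k} \rho\!\left( \big\| z_{ik} - \pi(T_i, x_k) \big\|^2_{\Sigma_{ik}} \right),
\qquad
\| e \|^2_{\Sigma} \;=\; e^\top \Sigma^{-1} e,
\]

where \rho is a robust loss. Under the usual i.i.d. assumption, \Sigma_{ik} = \Sigma for all observations; in the context-dependent formulation, features predicted to be noisier receive a larger \Sigma_{ik} and therefore less weight in the estimate.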