Unsupervised Monocular Depth Learning with Integrated Intrinsics and Spatio-Temporal Constraints

Paper title

Unsupervised Monocular Depth Learning with Integrated Intrinsics and Spatio-Temporal Constraints

Authors

Kenny Chen, Alexandra Pogue, Brett T. Lopez, Ali-akbar Agha-mohammadi, Ankur Mehta

Abstract

Monocular depth inference has gained tremendous attention from researchers in recent years and remains a promising replacement for expensive time-of-flight sensors, but issues with scale acquisition and implementation overhead still plague these systems. To this end, this work presents an unsupervised learning framework that is able to predict at-scale depth maps and egomotion, in addition to camera intrinsics, from a sequence of monocular images via a single network. Our method incorporates both spatial and temporal geometric constraints to resolve depth and pose scale factors, which are enforced within the supervisory reconstruction loss functions at training time. Only unlabeled stereo sequences are required for training the weights of our single-network architecture, which reduces overall implementation overhead compared to previous methods. Our results demonstrate strong performance when compared to the current state-of-the-art on multiple sequences of the KITTI driving dataset, and the reduced network complexity of our method can provide faster training times.
