Paper title
DELTAS: Depth Estimation by Learning Triangulation And densification of Sparse points
Paper authors
Abstract
Multi-view stereo (MVS) is the golden mean between the accuracy of active depth sensing and the practicality of monocular depth estimation. Cost-volume-based approaches employing 3D convolutional neural networks (CNNs) have considerably improved the accuracy of MVS systems. However, this accuracy comes at a high computational cost, which impedes practical adoption. Distinct from cost volume approaches, we propose an efficient depth estimation approach by first (a) detecting and evaluating descriptors for interest points, then (b) learning to match and triangulate a small set of interest points, and finally (c) densifying this sparse set of 3D points using CNNs. An end-to-end network efficiently performs all three steps within a deep learning framework and is trained with intermediate 2D image and 3D geometric supervision, along with depth supervision. Crucially, our first step complements pose estimation using interest point detection and descriptor learning. We demonstrate state-of-the-art results on depth estimation with lower compute for different scene lengths. Furthermore, our method generalizes to newer environments and the descriptors output by our network compare favorably to strong baselines. Code is available at https://github.com/magicleap/DELTAS
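To make step (b) of the abstract more concrete, here is a minimal sketch of how one matched interest point can be triangulated from two views with known camera poses. It uses the classical midpoint method in plain Python, which is an assumption chosen for illustration: the abstract does not specify the triangulation used inside the network, and the function name `triangulate_midpoint`, the camera centers, and the ray directions below are all hypothetical.

```python
def dot(a, b):
    """Dot product of two 3-vectors given as sequences."""
    return sum(x * y for x, y in zip(a, b))


def triangulate_midpoint(o1, d1, o2, d2):
    """Triangulate a 3D point from two viewing rays (illustrative only).

    Each ray is o + s * d, where o is a camera center and d is the
    direction through the matched interest point. The result is the
    midpoint of the shortest segment connecting the two rays.
    """
    w = [p - q for p, q in zip(o1, o2)]
    a, b, c = dot(d1, d1), dot(d1, d2), dot(d2, d2)
    d, e = dot(d1, w), dot(d2, w)
    denom = a * c - b * b
    if abs(denom) < 1e-12:
        # Parallel rays carry no depth information.
        raise ValueError("rays are parallel; depth is not observable")
    # Closed-form least-squares solution for the two ray parameters.
    s = (b * e - c * d) / denom
    t = (a * e - b * d) / denom
    p1 = [o + s * k for o, k in zip(o1, d1)]
    p2 = [o + t * k for o, k in zip(o2, d2)]
    return [(x + y) / 2 for x, y in zip(p1, p2)]


# Hypothetical setup: two cameras one unit apart along x, both observing
# the same interest point; the rays are built to meet at (0.5, 0.2, 4.0).
camera_1 = [0.0, 0.0, 0.0]
camera_2 = [1.0, 0.0, 0.0]
ray_1 = [0.5, 0.2, 4.0]
ray_2 = [-0.5, 0.2, 4.0]
point = triangulate_midpoint(camera_1, ray_1, camera_2, ray_2)
depth_in_camera_1 = point[2]  # z coordinate, since camera 1 sits at the origin
```

Repeating this for every matched interest point yields the sparse set of 3D points that step (c) would then densify into a full depth map.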