Neural Pixel Composition: 3D-4D View Synthesis from Multi-Views

Paper Title

Neural Pixel Composition: 3D-4D View Synthesis from Multi-Views

Authors

Aayush Bansal, Michael Zollhoefer

Abstract

We present Neural Pixel Composition (NPC), a novel approach for continuous 3D-4D view synthesis given only a discrete set of multi-view observations as input. Existing state-of-the-art approaches require dense multi-view supervision and an extensive computational budget. The proposed formulation reliably operates on sparse and wide-baseline multi-view imagery and can be trained efficiently within a few seconds to 10 minutes for hi-res (12MP) content, i.e., 200-400X faster convergence than existing methods. Crucial to our approach are two core novelties: 1) a representation of a pixel that contains color and depth information accumulated from multi-views for a particular location and time along a line of sight, and 2) a multi-layer perceptron (MLP) that enables the composition of this rich information provided for a pixel location to obtain the final color output. We experiment with a large variety of multi-view sequences, compare to existing approaches, and achieve better results in diverse and challenging settings. Finally, our approach enables dense 3D reconstruction from sparse multi-views, where COLMAP, a state-of-the-art 3D reconstruction approach, struggles.
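To make the two ingredients named in the abstract concrete, here is a minimal sketch in plain Python: a per-pixel descriptor built from color and depth samples, and a small MLP that maps that descriptor to a final color. Everything specific here is an assumption made for illustration and is not taken from the paper: the function names (`make_pixel_descriptor`, `init_mlp`, `compose_pixel`), the choice of three samples per pixel, the hidden width of 16, the ReLU and sigmoid activations, and the random untrained weights.

```python
import math
import random

def make_pixel_descriptor(samples):
    """Flatten K (r, g, b, depth) samples gathered for one pixel into one vector."""
    return [value for sample in samples for value in sample]

def init_mlp(sizes, seed=0):
    """Create random weights and zero biases for a fully connected network."""
    rng = random.Random(seed)
    layers = []
    for n_in, n_out in zip(sizes[:-1], sizes[1:]):
        scale = 1.0 / math.sqrt(n_in)
        weights = [[rng.uniform(-scale, scale) for _ in range(n_in)]
                   for _ in range(n_out)]
        layers.append((weights, [0.0] * n_out))
    return layers

def compose_pixel(layers, descriptor):
    """Run the descriptor through the MLP: ReLU on hidden layers, sigmoid on output."""
    x = descriptor
    for i, (weights, biases) in enumerate(layers):
        x = [sum(w * v for w, v in zip(row, x)) + b
             for row, b in zip(weights, biases)]
        if i < len(layers) - 1:
            x = [max(0.0, v) for v in x]
    # Sigmoid keeps each output channel inside (0, 1), like a normalized color.
    return [1.0 / (1.0 + math.exp(-v)) for v in x]

# Hypothetical pixel: K = 3 samples of (r, g, b, depth) along its line of sight.
samples = [(0.9, 0.1, 0.1, 1.2), (0.8, 0.2, 0.1, 1.3), (0.1, 0.1, 0.9, 4.0)]
descriptor = make_pixel_descriptor(samples)
mlp = init_mlp([len(descriptor), 16, 3])
rgb = compose_pixel(mlp, descriptor)
print(rgb)
```

Because the weights are random, the printed color is meaningless. The sketch only shows the data flow: accumulated multi-view samples go in, and one RGB value per pixel comes out.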

Paper Title