Paper Title
Deep scene-scale material estimation from multi-view indoor captures
Paper Authors
Paper Abstract
The movie and video game industries have adopted photogrammetry as a way to create digital 3D assets from multiple photographs of a real-world scene. However, photogrammetry algorithms typically output an RGB texture atlas of the scene that only serves as visual guidance for skilled artists to create material maps suitable for physically-based rendering. We present a learning-based approach that automatically produces digital assets ready for physically-based rendering, by estimating approximate material maps from multi-view captures of indoor scenes that are used with retopologized geometry. We base our approach on a material estimation Convolutional Neural Network (CNN) that we execute on each input image. We leverage the view-dependent visual cues provided by the multiple observations of the scene by gathering, for each pixel of a given image, the color of the corresponding point in other images. This image-space CNN provides us with an ensemble of predictions, which we merge in texture space as the last step of our approach. Our results demonstrate that the recovered assets can be directly used for physically-based rendering and editing of real indoor scenes from any viewpoint and under novel lighting. Our method generates approximate material maps in a fraction of the time required by the closest previous solutions.
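To make the multi-view gathering step concrete, the sketch below (not the authors' code) shows one way to collect, for each pixel of a reference view, the color observed at the corresponding 3D point in another view, assuming a per-view depth map and known camera intrinsics/extrinsics from the photogrammetry reconstruction. All names (gather_reprojected_colors, ref_K, ref_cam2world, etc.) are illustrative.

```python
import numpy as np

def gather_reprojected_colors(ref_depth, ref_K, ref_cam2world,
                              src_image, src_K, src_world2cam):
    """For every pixel of the reference view, return the RGB color observed
    at the corresponding point in a source view (NaN where it projects
    outside the source image). ref_depth is metric depth along the camera z axis."""
    h, w = ref_depth.shape
    ys, xs = np.mgrid[0:h, 0:w]
    pix = np.stack([xs, ys, np.ones_like(xs)], axis=-1).astype(np.float64)   # (h, w, 3)

    # Back-project reference pixels to 3D world points using the depth map.
    cam_pts = (np.linalg.inv(ref_K) @ pix.reshape(-1, 3).T) * ref_depth.reshape(1, -1)
    cam_pts_h = np.vstack([cam_pts, np.ones((1, cam_pts.shape[1]))])          # (4, N)
    world_pts = ref_cam2world @ cam_pts_h                                     # (4, N)

    # Project the world points into the source view.
    src_cam = src_world2cam @ world_pts                                       # (4, N)
    proj = src_K @ src_cam[:3]
    uv = proj[:2] / np.clip(proj[2:3], 1e-6, None)                            # (2, N)

    # Sample source colors with nearest-neighbour lookup; mark invalid pixels.
    u = np.round(uv[0]).astype(int)
    v = np.round(uv[1]).astype(int)
    valid = ((u >= 0) & (u < src_image.shape[1]) &
             (v >= 0) & (v < src_image.shape[0]) & (src_cam[2] > 0))
    colors = np.full((h * w, 3), np.nan)
    colors[valid] = src_image[v[valid], u[valid]]
    return colors.reshape(h, w, 3)
```

In a pipeline of this kind, the gathered colors from several source views could be stacked as extra input channels to the per-image material estimation CNN, and the resulting per-view predictions could then be merged in texture space, for instance by a weighted average over the views that observe each texel, mirroring the final merging step described in the abstract.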