Generative View Synthesis: From Single-view Semantics to Novel-view Images

Paper Title

Generative View Synthesis: From Single-view Semantics to Novel-view Images

Authors

Tewodros Habtegebrial, Varun Jampani, Orazio Gallo, Didier Stricker

Abstract

Content creation, central to applications such as virtual reality, can be tedious and time-consuming. Recent image synthesis methods simplify this task by offering tools to generate new views from as little as a single input image, or by converting a semantic map into a photorealistic image. We propose to push the envelope further, and introduce Generative View Synthesis (GVS), which can synthesize multiple photorealistic views of a scene given a single semantic map. We show that the sequential application of existing techniques, e.g., semantics-to-image translation followed by monocular view synthesis, fails to capture the scene's structure. In contrast, we solve the semantics-to-image translation in concert with the estimation of the 3D layout of the scene, thus producing geometrically consistent novel views that preserve semantic structures. We first lift the input 2D semantic map onto a 3D layered representation of the scene in feature space, thereby preserving the semantic labels of 3D geometric structures. We then project the layered features onto the target views to generate the final novel-view images. We verify the strengths of our method and compare it with several advanced baselines on three different datasets. Our approach also allows for style manipulation and image editing operations, such as the addition or removal of objects, with simple manipulations of the input style images and semantic maps, respectively. Visit the project page at https://gvsnet.github.io.
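
The abstract does not say how the layered features are merged into a single target view. As a rough illustration only, the sketch below assumes each layer carries a per-pixel opacity and that layers are merged by back-to-front "over" compositing, as is common for layered scene representations. The function name `composite_layers`, the array shapes, and the opacity input are hypothetical and not taken from the paper; warping the layers into the target camera is omitted.

```python
import numpy as np

def composite_layers(features, alphas):
    """Merge layered features with back-to-front 'over' compositing.

    Assumed (hypothetical) layout:
      features: (L, H, W, C) array, layer 0 is the farthest from the camera
      alphas:   (L, H, W, 1) array of per-pixel opacities in [0, 1]
    Returns an (H, W, C) array.
    """
    features = np.asarray(features, dtype=np.float64)
    alphas = np.asarray(alphas, dtype=np.float64)
    out = np.zeros(features.shape[1:])
    # Walk from the farthest layer to the nearest; each nearer layer
    # covers what is behind it in proportion to its opacity.
    for feat, alpha in zip(features, alphas):
        out = alpha * feat + (1.0 - alpha) * out
    return out
```

With an opaque far layer and a half-transparent near layer, each output pixel is an equal blend of the two layers' features.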
