Generative Multiplane Images: Making a 2D GAN 3D-Aware

Paper Title

Generative Multiplane Images: Making a 2D GAN 3D-Aware

Authors

Xiaoming Zhao, Fangchang Ma, David Güera, Zhile Ren, Alexander G. Schwing, Alex Colburn

Abstract

What is really needed to make an existing 2D GAN 3D-aware? To answer this question, we modify a classical GAN, i.e., StyleGANv2, as little as possible. We find that only two modifications are absolutely necessary: 1) a multiplane image style generator branch which produces a set of alpha maps conditioned on their depth; 2) a pose-conditioned discriminator. We refer to the generated output as a 'generative multiplane image' (GMPI) and emphasize that its renderings are not only high-quality but also guaranteed to be view-consistent, which makes GMPIs different from many prior works. Importantly, the number of alpha maps can be dynamically adjusted and can differ between training and inference, alleviating memory concerns and enabling fast training of GMPIs in less than half a day at a resolution of $1024^2$. Our findings are consistent across three challenging and common high-resolution datasets, including FFHQ, AFHQv2, and MetFaces.
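
To make the role of the alpha maps concrete, the following is a minimal sketch of standard front-to-back "over" compositing for a multiplane image, written with NumPy. It is not taken from the paper: the function name `composite_mpi`, the array layout, and the use of a separate color per plane are assumptions made for illustration, and the per-plane warp needed to render from a new camera pose is omitted.

```python
import numpy as np

def composite_mpi(colors, alphas):
    """Blend fronto-parallel planes with front-to-back 'over' compositing.

    Hypothetical layout (an assumption, not from the paper):
      colors: (L, H, W, 3) per-plane RGB, plane 0 nearest the camera
      alphas: (L, H, W, 1) per-plane opacity in [0, 1]
    Returns an (H, W, 3) image.
    """
    colors = np.asarray(colors, dtype=np.float64)
    alphas = np.asarray(alphas, dtype=np.float64)

    # Transmittance reaching plane i is the product of (1 - alpha_j) for j < i.
    trans = np.cumprod(1.0 - alphas, axis=0)
    trans = np.concatenate([np.ones_like(trans[:1]), trans[:-1]], axis=0)

    # Each plane contributes its color weighted by opacity and transmittance.
    weights = alphas * trans
    return (weights * colors).sum(axis=0)
```

Because the blend is just a weighted sum over the plane axis, the number of planes `L` is a free dimension of the input arrays, which is in keeping with the abstract's remark that the number of alpha maps can differ between training and inference.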
