PV3D: A 3D Generative Model for Portrait Video Generation

Paper Title

PV3D: A 3D Generative Model for Portrait Video Generation

Authors

Zhongcong Xu, Jianfeng Zhang, Jun Hao Liew, Wenqing Zhang, Song Bai, Jiashi Feng, Mike Zheng Shou

Abstract

Recent advances in generative adversarial networks (GANs) have demonstrated the capabilities of generating stunning photo-realistic portrait images. While some prior works have applied such image GANs to unconditional 2D portrait video generation and static 3D portrait synthesis, there are few works successfully extending GANs for generating 3D-aware portrait videos. In this work, we propose PV3D, the first generative framework that can synthesize multi-view consistent portrait videos. Specifically, our method extends the recent static 3D-aware image GAN to the video domain by generalizing the 3D implicit neural representation to model the spatio-temporal space. To introduce motion dynamics to the generation process, we develop a motion generator by stacking multiple motion layers to generate motion features via modulated convolution. To alleviate motion ambiguities caused by camera/human motions, we propose a simple yet effective camera condition strategy for PV3D, enabling both temporal and multi-view consistent video generation. Moreover, PV3D introduces two discriminators for regularizing the spatial and temporal domains to ensure the plausibility of the generated portrait videos. These elaborated designs enable PV3D to generate 3D-aware motion-plausible portrait videos with high-quality appearance and geometry, significantly outperforming prior works. As a result, PV3D is able to support many downstream applications such as animating static portraits and view-consistent video motion editing. Code and models are released at https://showlab.github.io/pv3d.
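The abstract says the motion generator produces motion features "via modulated convolution" but gives no layer details. As a rough illustration only, the sketch below shows the weight modulation and demodulation step commonly used in StyleGAN2-style generators, reduced to a 1x1 convolution at a single pixel. The function names (`modulate_weights`, `modulated_conv1x1`), the list-based layout, and the choice of this particular formulation are assumptions for illustration, not PV3D's released implementation.

```python
import math


def modulate_weights(weight, style, eps=1e-8):
    """Scale a 1x1 conv kernel per input channel, then renormalize each output row.

    weight: list of rows, shape [out_channels][in_channels] (hypothetical layout)
    style:  per-input-channel scales, e.g. derived from a motion code (assumption)
    """
    modulated = []
    for row in weight:
        # Modulation: scale each input channel's weight by its style value.
        scaled = [w * s for w, s in zip(row, style)]
        # Demodulation: rescale so each output channel's weights have unit L2 norm.
        norm = math.sqrt(sum(v * v for v in scaled) + eps)
        modulated.append([v / norm for v in scaled])
    return modulated


def modulated_conv1x1(features, weight, style):
    """Apply the modulated kernel to one pixel's feature vector of length in_channels."""
    kernel = modulate_weights(weight, style)
    return [sum(k * x for k, x in zip(row, features)) for row in kernel]
```

In a full generator the same modulation would be applied to spatial kernels across the whole feature map, with the style vector changing per frame so that successive frames receive different motion features.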
