Reconstructing Personalized Semantic Facial NeRF Models From Monocular Video

Paper title

Reconstructing Personalized Semantic Facial NeRF Models From Monocular Video

Authors

Xuan Gao, Chenglai Zhong, Jun Xiang, Yang Hong, Yudong Guo, Juyong Zhang

Abstract

We present a novel semantic model of the human head defined with a neural radiance field. The 3D-consistent head model consists of a set of disentangled and interpretable bases and can be driven by low-dimensional expression coefficients. Thanks to the powerful representation ability of neural radiance fields, the constructed model can represent complex facial attributes, including hair and worn accessories, which cannot be represented by traditional mesh blendshapes. To construct the personalized semantic facial model, we propose to define the bases as several multi-level voxel fields. With a short monocular RGB video as input, our method can construct the subject's semantic facial NeRF model in only ten to twenty minutes, and can render a photo-realistic human head image in tens of milliseconds given an expression coefficient and a view direction. We apply this novel representation to many tasks, such as facial retargeting and expression editing. Experimental results demonstrate its strong representation ability and its training and inference speed. Demo videos and released code are provided on our project page: https://ustc3dv.github.io/NeRFBlendShape/
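To make the idea of bases driven by expression coefficients more concrete, here is a minimal sketch. The abstract does not state how the bases are combined, so the weighted sum below is an assumption modeled on classic mesh blendshapes. The names `blend_bases`, `VoxelField`, and `MultiLevelField`, and the flattened grid layout, are hypothetical and are not taken from the released code.

```python
from typing import List, Sequence

# Hypothetical layout: each voxel field is a flattened list of feature values,
# and a multi-level field holds one such grid per resolution level.
VoxelField = List[float]
MultiLevelField = List[VoxelField]


def blend_bases(
    bases: Sequence[MultiLevelField], coeffs: Sequence[float]
) -> MultiLevelField:
    """Blend K multi-level voxel-field bases with K expression coefficients.

    Assumption: the blend is a plain weighted sum, applied level by level.
    """
    if not bases:
        raise ValueError("at least one basis is required")
    if len(bases) != len(coeffs):
        raise ValueError("need exactly one coefficient per basis")

    blended: MultiLevelField = []
    for level in range(len(bases[0])):
        size = len(bases[0][level])
        out = [0.0] * size
        for basis, weight in zip(bases, coeffs):
            grid = basis[level]
            if len(grid) != size:
                raise ValueError("all bases must share the same grid sizes")
            # Accumulate this basis's contribution to every voxel feature.
            for i, value in enumerate(grid):
                out[i] += weight * value
        blended.append(out)
    return blended


# Two bases, each with a coarse level (2 values) and a finer level (4 values).
bases = [
    [[1.0, 0.0], [1.0, 1.0, 1.0, 1.0]],
    [[0.0, 2.0], [0.0, 0.0, 4.0, 4.0]],
]
blended = blend_bases(bases, [0.5, 0.25])
print(blended)  # [[0.5, 0.5], [0.5, 0.5, 1.5, 1.5]]
```

In a full pipeline, the blended field would then be queried at sample points along camera rays and decoded into density and color. That stage is omitted here because the abstract gives no details about it.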

Paper title