Paper Title
Biphasic Face Photo-Sketch Synthesis via Semantic-Driven Generative Adversarial Network with Graph Representation Learning
Paper Authors
Paper Abstract
Biphasic face photo-sketch synthesis has significant practical value in wide-ranging fields such as digital entertainment and law enforcement. Previous approaches directly generate photos and sketches in a global view; they often suffer from low sketch quality and complex photo variations, leading to unnatural and low-fidelity results. In this paper, we propose a novel Semantic-Driven Generative Adversarial Network, cooperating with graph representation learning, to address the above issues. Considering that human faces have distinct spatial structures, we first inject class-wise semantic layouts into the generator to provide style-based spatial information for the synthesized face photos and sketches. Additionally, to enhance the authenticity of details in generated faces, we construct two types of representational graphs from semantic parsing maps of the input faces, dubbed the IntrA-class Semantic Graph (IASG) and the InteR-class Structure Graph (IRSG). Specifically, the IASG effectively models the intra-class semantic correlations of each facial semantic component, thus producing realistic facial details. To keep the generated faces structurally coordinated, the IRSG models the inter-class structural relations among facial components via graph representation learning. To further improve the perceptual quality of synthesized images, we present a biphasic interactive cycle training strategy that fully exploits the multi-level feature consistency between photos and sketches. Extensive experiments demonstrate that our method outperforms state-of-the-art competitors on the CUFS and CUFSF datasets.
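The abstract does not give implementation details, but the idea of building graphs over facial semantic components can be illustrated with a short sketch. The PyTorch snippet below shows one plausible way to derive one graph node per semantic class via masked average pooling and then relate the nodes across classes, roughly in the spirit of the IASG/IRSG described above. All function names, tensor shapes, the cosine-affinity adjacency, and the 19-class parsing convention are assumptions for illustration, not the authors' implementation.

    # Minimal sketch: component-level graph nodes from a semantic parsing map.
    # Everything here (names, shapes, adjacency scheme) is an illustrative
    # assumption, not the paper's actual implementation.
    import torch
    import torch.nn.functional as F

    def component_node_features(feat, parsing, num_classes):
        """Masked average pooling: one node per facial semantic class.

        feat:    (B, C, H, W) feature map from a generator encoder.
        parsing: (B, H, W) integer semantic parsing map in [0, num_classes).
        Returns: (B, num_classes, C) node features.
        """
        onehot = F.one_hot(parsing.long(), num_classes)          # (B, H, W, K)
        onehot = onehot.permute(0, 3, 1, 2).float()              # (B, K, H, W)
        area = onehot.sum(dim=(2, 3)).clamp(min=1.0)             # (B, K)
        # Sum features under each class mask, then normalize by mask area.
        nodes = torch.einsum("bchw,bkhw->bkc", feat, onehot)
        return nodes / area.unsqueeze(-1)

    def inter_class_relations(nodes):
        """A simple dense adjacency over component nodes (IRSG-style):
        cosine affinity between every pair of facial components, followed
        by one round of message passing."""
        normed = F.normalize(nodes, dim=-1)                      # (B, K, C)
        adj = torch.softmax(normed @ normed.transpose(1, 2), -1) # (B, K, K)
        return adj @ nodes                                       # (B, K, C)

    # Toy usage with random tensors standing in for real encoder outputs.
    feat = torch.randn(2, 64, 32, 32)
    parsing = torch.randint(0, 19, (2, 32, 32))  # e.g., 19 CelebAMask-HQ classes
    nodes = component_node_features(feat, parsing, num_classes=19)
    refined = inter_class_relations(nodes)
    print(nodes.shape, refined.shape)  # (2, 19, 64) and (2, 19, 64)

Pooling per parsing class keeps each node tied to one facial component (eyes, nose, hair, ...), so the subsequent message passing exchanges information between components rather than between raw pixels, which is the structural coordination the IRSG is described as providing.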