Paper Title

Learning Layout and Style Reconfigurable GANs for Controllable Image Synthesis

Paper Authors

Wei Sun, Tianfu Wu

Paper Abstract

With the remarkable recent progress on learning deep generative models, it becomes increasingly interesting to develop models for controllable image synthesis from reconfigurable inputs. This paper focuses on a recently emerged task, layout-to-image, to learn generative models that are capable of synthesizing photo-realistic images from a spatial layout (i.e., object bounding boxes configured in an image lattice) and style (i.e., structural and appearance variations encoded by latent vectors). This paper first proposes an intuitive paradigm for the task, layout-to-mask-to-image, which learns to unfold object masks from the bounding boxes given in an input layout, bridging the gap between the input layout and the synthesized image. It then presents a method built on Generative Adversarial Networks for the proposed layout-to-mask-to-image paradigm, with style control at both the image and mask levels. Object masks are learned from the input layout and iteratively refined along the stages of the generator network. Style control at the image level is the same as in vanilla GANs, while style control at the object mask level is realized by a proposed novel feature normalization scheme, Instance-Sensitive and Layout-Aware Normalization. In experiments, the proposed method is tested on the COCO-Stuff and Visual Genome datasets, obtaining state-of-the-art performance.
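To make the mask-level style control concrete, below is a minimal PyTorch sketch of an instance-sensitive, layout-aware normalization layer in the spirit of the abstract. This is an illustrative reconstruction rather than the authors' released implementation: the class name ISLANorm, the tensor shapes, the linear projections to_gamma/to_beta, and the mask-weighted spreading of per-object affine parameters are all assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ISLANorm(nn.Module):
    """Sketch of an instance-sensitive, layout-aware normalization layer.

    Normalization statistics come from parameter-free BatchNorm; the
    affine parameters (gamma, beta) are computed per object from its
    style latent vector and spread over space via the object's mask.
    """

    def __init__(self, num_features, style_dim, eps=1e-5):
        super().__init__()
        self.bn = nn.BatchNorm2d(num_features, affine=False, eps=eps)
        # Hypothetical projections: one style vector -> per-channel gamma/beta.
        self.to_gamma = nn.Linear(style_dim, num_features)
        self.to_beta = nn.Linear(style_dim, num_features)

    def forward(self, x, styles, masks):
        # x:      (N, C, H, W) feature map at the current generator stage
        # styles: (N, K, style_dim), one latent style vector per object
        # masks:  (N, K, h, w) soft object masks placed by the layout
        n, c, height, width = x.shape
        x = self.bn(x)
        gamma = self.to_gamma(styles)  # (N, K, C)
        beta = self.to_beta(styles)    # (N, K, C)
        # Resize the masks to this stage's feature resolution.
        m = F.interpolate(masks, size=(height, width),
                          mode='bilinear', align_corners=False)
        # Spread per-object parameters into spatial maps through the masks:
        # sum over objects k of gamma[n, k, c] * mask[n, k, y, x].
        gamma_map = torch.einsum('nkc,nkhw->nchw', gamma, m)
        beta_map = torch.einsum('nkc,nkhw->nchw', beta, m)
        return x * (1 + gamma_map) + beta_map
```

In a generator stage, such a layer would be called with the current feature map, the per-object style latents, and the masks predicted (and iteratively refined) for that stage; the (1 + gamma) modulation keeps the layer close to identity at initialization, so each object's style vector steers only the regions its mask covers.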
