Paper Title

Attribute-Guided Image Generation from Layout

Authors

Ke Ma, Bo Zhao, Leonid Sigal

Abstract

Recent approaches have achieved great success in image generation from structured inputs, e.g., semantic segmentation, scene graphs, or layouts. Although these methods allow specification of objects and their locations at the image level, they lack the fidelity and semantic control to specify the visual appearance of these objects at the instance level. To address this limitation, we propose a new image generation method that enables instance-level attribute control. Specifically, the input to our attribute-guided generative model is a tuple that contains: (1) object bounding boxes, (2) object categories, and (3) an (optional) set of attributes for each object. The output is a generated image where the requested objects are in the desired locations and have the prescribed attributes. Several losses work collaboratively to encourage accurate, consistent, and diverse image generation. Experiments on the Visual Genome dataset demonstrate our model's capacity to control object-level attributes in generated images and validate the plausibility of the disentangled object-attribute representation in the image-generation-from-layout task. In addition, the images generated by our model have higher resolution, object classification accuracy, and consistency, compared to the previous state of the art.
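
To make the structured input concrete, below is a minimal Python sketch of the layout specification the abstract describes: per-object bounding boxes, categories, and an optional attribute set. The class and function names (`LayoutObject`, `attribute_guided_generator`) are hypothetical illustrations, not the authors' actual code or API.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

# Hypothetical container for one object in the layout. Coordinates are
# normalized (x0, y0, x1, y1) in [0, 1]; names are illustrative only.
@dataclass
class LayoutObject:
    category: str                                  # object category, e.g. "car"
    bbox: Tuple[float, float, float, float]        # (1) object bounding box
    attributes: List[str] = field(default_factory=list)  # (3) optional attributes

# A layout is the per-image list of such tuples: the model's input of
# (1) boxes, (2) categories, and (3) optional per-instance attribute sets.
layout = [
    LayoutObject("sky",  (0.0, 0.0, 1.0, 0.4)),                  # no attributes requested
    LayoutObject("car",  (0.1, 0.5, 0.5, 0.9), ["red", "shiny"]),
    LayoutObject("tree", (0.6, 0.2, 0.9, 0.9), ["green"]),
]

# The generator would map this layout to an image; a call might look like
# the following (purely illustrative signature):
#   image = attribute_guided_generator(layout, image_size=(128, 128))
for obj in layout:
    print(f"{obj.category:>5} at {obj.bbox} with attributes {obj.attributes}")
```

The point of the sketch is the instance-level granularity: attributes attach to individual objects rather than to the image as a whole, which is the control the paper claims over prior layout-to-image methods.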
