Paper Title

Imagen Editor and EditBench: Advancing and Evaluating Text-Guided Image Inpainting

Authors

Su Wang, Chitwan Saharia, Ceslee Montgomery, Jordi Pont-Tuset, Shai Noy, Stefano Pellegrini, Yasumasa Onoe, Sarah Laszlo, David J. Fleet, Radu Soricut, Jason Baldridge, Mohammad Norouzi, Peter Anderson, William Chan

Abstract

Text-guided image editing can have a transformative impact in supporting creative applications. A key challenge is to generate edits that are faithful to input text prompts, while consistent with input images. We present Imagen Editor, a cascaded diffusion model built by fine-tuning Imagen on text-guided image inpainting. Imagen Editor's edits are faithful to the text prompts, which is accomplished by using object detectors to propose inpainting masks during training. In addition, Imagen Editor captures fine details in the input image by conditioning the cascaded pipeline on the original high-resolution image. To improve qualitative and quantitative evaluation, we introduce EditBench, a systematic benchmark for text-guided image inpainting. EditBench evaluates inpainting edits on natural and generated images exploring objects, attributes, and scenes. Through extensive human evaluation on EditBench, we find that object-masking during training leads to across-the-board improvements in text-image alignment -- such that Imagen Editor is preferred over DALL-E 2 and Stable Diffusion -- and, as a cohort, these models are better at object-rendering than text-rendering, and handle material/color/size attributes better than count/shape attributes.
