Generated Faces in the Wild: Quantitative Comparison of Stable Diffusion, Midjourney and DALL-E 2

Paper Title

Generated Faces in the Wild: Quantitative Comparison of Stable Diffusion, Midjourney and DALL-E 2

Author

Borji, Ali

Abstract

The field of image synthesis has made great strides in the last couple of years. Recent models are capable of generating images with astonishing quality. Fine-grained evaluation of these models on some interesting categories such as faces is still missing. Here, we conduct a quantitative comparison of three popular systems including Stable Diffusion, Midjourney, and DALL-E 2 in their ability to generate photorealistic faces in the wild. We find that Stable Diffusion generates better faces than the other systems, according to the FID score. We also introduce a dataset of generated faces in the wild dubbed GFW, including a total of 15,076 faces. Furthermore, we hope that our study spurs follow-up research in assessing the generative models and improving them. Data and code are available at data and code, respectively.
