Paper Title
The Infinite Index: Information Retrieval on Generative Text-To-Image Models
Paper Authors
Paper Abstract
Conditional generative models such as DALL-E and Stable Diffusion generate images based on a user-defined text, the prompt. Finding and refining prompts that produce a desired image has become the art of prompt engineering. Generative models do not provide a built-in retrieval model for a user's information need expressed through prompts. In light of an extensive literature review, we reframe prompt engineering for generative models as interactive text-based retrieval on a novel kind of "infinite index". We apply these insights for the first time in a case study on image generation for game design with an expert. Finally, we envision how active learning may help to guide the retrieval of generated images.