A Taxonomy of Prompt Modifiers for Text-to-Image Generation

Paper Title

A Taxonomy of Prompt Modifiers for Text-To-Image Generation

Author

Oppenlaender, Jonas

Abstract

Text-to-image generation has seen an explosion of interest since 2021. Today, beautiful and intriguing digital images and artworks can be synthesized from textual inputs ("prompts") with deep generative models. Online communities around text-to-image generation and AI generated art have quickly emerged. This paper identifies six types of prompt modifiers used by practitioners in the online community based on a 3-month ethnographic study. The novel taxonomy of prompt modifiers provides researchers a conceptual starting point for investigating the practice of text-to-image generation, but may also help practitioners of AI generated art improve their images. We further outline how prompt modifiers are applied in the practice of "prompt engineering." We discuss research opportunities of this novel creative practice in the field of Human-Computer Interaction (HCI). The paper concludes with a discussion of broader implications of prompt engineering from the perspective of Human-AI Interaction (HAI) in future applications beyond the use case of text-to-image generation and AI generated art.

Paper Title