Adversarial Attacks on Image Generation With Made-Up Words

Paper Title

Adversarial Attacks on Image Generation With Made-Up Words

Author

Millière, Raphaël

Abstract

Text-guided image generation models can be prompted to generate images using nonce words adversarially designed to robustly evoke specific visual concepts. Two approaches for such generation are introduced: macaronic prompting, which involves designing cryptic hybrid words by concatenating subword units from different languages; and evocative prompting, which involves designing nonce words whose broad morphological features are similar enough to those of existing words to trigger robust visual associations. The two methods can also be combined to generate images associated with more specific visual concepts. The implications of these techniques for the circumvention of existing approaches to content moderation, and particularly the generation of offensive or harmful images, are discussed.
