Paper Title

Describing Textures using Natural Language

Authors

Chenyun Wu, Mikayla Timm, Subhransu Maji

Abstract

Textures in natural images can be characterized by color, shape, periodicity of elements within them, and other attributes that can be described using natural language. In this paper, we study the problem of describing visual attributes of texture on a novel dataset containing rich descriptions of textures, and conduct a systematic study of current generative and discriminative models for grounding language to images on this dataset. We find that while these models capture some properties of texture, they fail to capture several compositional properties, such as the colors of dots. We provide critical analysis of existing models by generating synthetic but realistic textures with different descriptions. Our dataset also allows us to train interpretable models and generate language-based explanations of what discriminative features are learned by deep networks for fine-grained categorization where texture plays a key role. We present visualizations of several fine-grained domains and show that texture attributes learned on our dataset offer improvements over expert-designed attributes on the Caltech-UCSD Birds dataset.
