Paper Title

Hybrid Routing Transformer for Zero-Shot Learning

Authors

Cheng, De; Wang, Gerong; Wang, Bo; Zhang, Qiang; Han, Jungong; Zhang, Dingwen

Abstract

Zero-shot learning (ZSL) aims to learn models that, after training on data with seen semantics, can recognize images with unseen semantics. Recent studies either leverage global image features or mine discriminative local patch features to associate the extracted visual features with the semantic attributes. However, these methods lack the top-down guidance and semantic alignment needed to ensure that the model attends to the truly attribute-correlated regions. As a result, they still face a significant semantic gap between the visual modality and the attribute modality, which makes their predictions on unseen semantics unreliable. To solve this problem, this paper establishes a novel transformer encoder-decoder model, called the hybrid routing transformer (HRT). In the HRT encoder, we embed an active attention, which is constructed from both bottom-up and top-down dynamic routing pathways, to generate the attribute-aligned visual feature. In the HRT decoder, we use static routing to calculate the correlation among the attribute-aligned visual features, the corresponding attribute semantics, and the class attribute vectors to generate the final class label predictions. This design makes the presented transformer model a hybrid of 1) top-down and bottom-up attention pathways and 2) dynamic and static routing pathways. Comprehensive experiments are conducted on three widely used benchmark datasets, namely CUB, SUN, and AWA2. The experimental results demonstrate the effectiveness of the proposed method.
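To make the final prediction step more concrete, the following is a minimal sketch of generic attribute-based zero-shot classification. It is not the HRT decoder: a plain dot product between predicted attribute scores and class attribute vectors stands in for the static routing described in the abstract. The function names, class names, and toy attribute values are all hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def predict_class(attribute_scores, class_attributes):
    """Pick the class whose attribute vector best matches the predicted scores.

    attribute_scores: per-attribute scores predicted from an image
        (assumed to come from some visual model; not computed here).
    class_attributes: dict mapping class name -> class attribute vector.
    Compatibility is a plain dot product, an assumption of this sketch.
    """
    names = list(class_attributes)
    logits = [
        sum(a * c for a, c in zip(attribute_scores, class_attributes[name]))
        for name in names
    ]
    probs = softmax(logits)
    best = max(range(len(names)), key=lambda i: probs[i])
    return names[best], dict(zip(names, probs))


# Hypothetical toy data with three attributes: striped, has_wings, aquatic.
# Neither class needs to appear in training; only its attribute vector is used.
unseen_classes = {
    "zebra": [1.0, 0.0, 0.0],
    "penguin": [0.0, 1.0, 1.0],
}
attribute_scores = [0.9, 0.1, 0.2]

label, probs = predict_class(attribute_scores, unseen_classes)
print(label, probs)
```

Because classes are described only by their attribute vectors, a new class can be added at test time by supplying its vector, which is the property that zero-shot recognition relies on.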
