Paper Title

Rethinking Batch Sample Relationships for Data Representation: A Batch-Graph Transformer based Approach

Authors

Xixi Wang, Bo Jiang, Xiao Wang, Bin Luo

Abstract

Exploring sample relationships within each mini-batch has shown great potential for learning image representations. Existing works generally adopt the regular Transformer to model visual content relationships, ignoring the cues of semantic/label correlations between samples. They also generally adopt the "full" self-attention mechanism, which is clearly redundant and sensitive to noisy samples. To overcome these issues, in this paper we design a simple yet flexible Batch-Graph Transformer (BGFormer) for mini-batch sample representation by deeply capturing the relationships of image samples from both visual and semantic perspectives. BGFormer has three main aspects. (1) It employs a flexible graph model, termed Batch Graph, to jointly encode the visual and semantic relationships of samples within each mini-batch. (2) It explores the neighborhood relationships of samples by borrowing the idea of sparse graph representation, and thus performs robustly w.r.t. noisy samples. (3) It devises a novel Transformer architecture that mainly adopts dual structure-constrained self-attention (SSA), together with graph normalization, an FFN, etc., to carefully exploit the batch-graph information for sample token (node) representations. As an application, we apply BGFormer to metric learning tasks. Extensive experiments on four popular datasets demonstrate the effectiveness of the proposed model.
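To make the abstract's two key ideas concrete, below is a minimal NumPy sketch of (a) a batch graph that combines sparse visual kNN edges with semantic same-label edges, and (b) a self-attention step masked by that graph (structure-constrained attention). This is an illustrative reconstruction from the abstract only; the function names, the kNN construction, and the way the two graphs are merged are assumptions, not the paper's exact formulation.

```python
import numpy as np

def batch_graph(features, labels, k=3):
    """Build batch-graph adjacencies over a mini-batch (illustrative).

    Visual edges: each sample links to its k nearest neighbors in feature
    space (the sparse-graph idea, for robustness to noisy samples).
    Semantic edges: samples sharing a label are linked.
    """
    n = features.shape[0]
    # Cosine similarity between L2-normalized sample features.
    f = features / np.linalg.norm(features, axis=1, keepdims=True)
    sim = f @ f.T
    visual = np.zeros((n, n))
    for i in range(n):
        nbrs = np.argsort(-sim[i])[: k + 1]  # top-k neighbors (incl. self)
        visual[i, nbrs] = 1.0
    # Semantic graph: 1 where labels match (diagonal is always 1).
    semantic = (labels[:, None] == labels[None, :]).astype(float)
    return visual, semantic

def masked_attention(x, mask):
    """Self-attention restricted to edges of a graph mask."""
    scores = x @ x.T / np.sqrt(x.shape[1])
    scores = np.where(mask > 0, scores, -np.inf)  # drop non-edges
    w = np.exp(scores - scores.max(axis=1, keepdims=True))
    w /= w.sum(axis=1, keepdims=True)             # row-normalize
    return w @ x                                   # aggregate neighbors
```

A "dual" design in the spirit of the abstract would run `masked_attention` once with the visual graph and once with the semantic graph and fuse the results; how BGFormer fuses them (and its graph normalization and FFN) is not specified in the abstract.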
