Paper Title
GroupTransNet: Group Transformer Network for RGB-D Salient Object Detection
Paper Authors
Paper Abstract
Salient object detection on RGB-D images is an active topic in computer vision. Although existing methods achieve appreciable performance, some challenges remain. Because convolution is a local operation, a convolutional neural network must be sufficiently deep to obtain a global receptive field, which leads to the loss of local details. To address this challenge, we propose a novel Group Transformer Network (GroupTransNet) for RGB-D salient object detection. The method learns the long-range dependencies of cross-layer features to produce a better feature representation. First, the features of the middle three levels and of the latter three levels, which are the slightly higher levels, are softly grouped to absorb the advantages of high-level features. The input features are repeatedly purified and enhanced by an attention mechanism, which purifies the cross-modal features of the color and depth modalities. The intermediate features are first fused across different layers and then processed by several transformers in multiple groups, which not only unifies the size of the features at each scale and interrelates them, but also shares weights among the features within a group. Owing to the level difference, the output features of the different groups are clustered in a staggered manner, offset by two, and then combined with the low-level features. Extensive experiments demonstrate that GroupTransNet outperforms the comparison models and achieves new state-of-the-art performance.
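The grouping and weight-sharing idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a five-level feature hierarchy (the abstract does not state the number of levels), represents each level as a flattened list of tokens, unifies sizes with simple average pooling, and uses a single-head self-attention with one set of projection weights per group. All names (`soft_groups`, `pool_tokens`, `self_attention`, `run_group`) and all sizes are hypothetical.

```python
import math
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax(row):
    """Numerically stable softmax over one list of scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def soft_groups(num_levels=5):
    """Overlapping ("soft") groups: the middle three and the latter three levels.

    Assumption: levels are numbered 1..num_levels from low to high.
    """
    levels = list(range(1, num_levels + 1))
    start = (num_levels - 3) // 2
    return {"middle": levels[start:start + 3], "latter": levels[-3:]}


def pool_tokens(tokens, target_len):
    """Average-pool a token list to target_len tokens (length must divide evenly)."""
    step = len(tokens) // target_len
    dim = len(tokens[0])
    return [
        [sum(tokens[i * step + j][d] for j in range(step)) / step for d in range(dim)]
        for i in range(target_len)
    ]


def self_attention(tokens, wq, wk, wv):
    """Single-head scaled dot-product self-attention over all tokens."""
    q, k, v = matmul(tokens, wq), matmul(tokens, wk), matmul(tokens, wv)
    scale = math.sqrt(len(wq[0]))
    scores = matmul(q, [list(col) for col in zip(*k)])  # q @ k^T
    weights = [softmax([s / scale for s in row]) for row in scores]
    return matmul(weights, v)


def run_group(features, levels, weights, target_len=4):
    """Unify token counts, concatenate the group's levels, apply shared weights."""
    tokens = []
    for lvl in levels:
        tokens.extend(pool_tokens(features[lvl], target_len))
    return self_attention(tokens, *weights)


def rand_matrix(rows, cols):
    return [[random.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]


random.seed(0)
DIM = 4  # hypothetical channel width

# Hypothetical fused features: higher levels have fewer tokens (lower resolution).
features = {lvl: rand_matrix(n, DIM) for lvl, n in zip(range(1, 6), (64, 32, 16, 8, 4))}

groups = soft_groups(5)
# One set of (W_q, W_k, W_v) per group, shared by every level in that group.
group_weights = {name: tuple(rand_matrix(DIM, DIM) for _ in range(3)) for name in groups}

outputs = {
    name: run_group(features, levels, group_weights[name])
    for name, levels in groups.items()
}
```

Under these assumptions, the two groups overlap on the higher levels, which is one way to read "soft grouping". Each group's output contains three levels' worth of equally sized tokens that have attended to one another through the same projection weights.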