Bifurcated backbone strategy for RGB-D salient object detection

Paper title

Bifurcated backbone strategy for RGB-D salient object detection

Authors

Yingjie Zhai, Deng-Ping Fan, Jufeng Yang, Ali Borji, Ling Shao, Junwei Han, Liang Wang

Abstract

Multi-level feature fusion is a fundamental topic in computer vision. It has been exploited to detect, segment and classify objects at various scales. When multi-level features meet multi-modal cues, the optimal feature aggregation and multi-modal learning strategy become a hot potato. In this paper, we leverage the inherent multi-modal and multi-level nature of RGB-D salient object detection to devise a novel cascaded refinement network. In particular, first, we propose to regroup the multi-level features into teacher and student features using a bifurcated backbone strategy (BBS). Second, we introduce a depth-enhanced module (DEM) to excavate informative depth cues from the channel and spatial views. Then, RGB and depth modalities are fused in a complementary way. Our architecture, named Bifurcated Backbone Strategy Network (BBS-Net), is simple, efficient, and backbone-independent. Extensive experiments show that BBS-Net significantly outperforms eighteen SOTA models on eight challenging datasets under five evaluation measures, demonstrating the superiority of our approach ($\sim 4\%$ improvement in S-measure vs. the top-ranked model: DMRA-iccv2019). In addition, we provide a comprehensive analysis of the generalization ability of different RGB-D datasets and provide a powerful training set for future research.
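
The abstract only states that the DEM mines depth cues "from the channel and spatial views" and that RGB and depth are then fused "in a complementary way". As a rough illustration of what such a step could look like, the NumPy sketch below gates a depth feature map first per channel and then per location, and adds the result to the RGB features. Everything specific here is an assumption made for illustration and is not taken from the abstract: the function names, the use of max pooling, the small two-layer gating network with random weights, the omission of any learned spatial convolution, and the additive fusion.

```python
import numpy as np


def sigmoid(x):
    """Element-wise logistic function, maps any real value into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-x))


def channel_attention(feat, w1, w2):
    """Re-weight each channel of a (C, H, W) feature map.

    Assumed design: global max pooling over space, then a two-layer
    gating network (w1: (C_r, C), w2: (C, C_r)) with a sigmoid output.
    """
    pooled = feat.max(axis=(1, 2))             # (C,) one value per channel
    hidden = np.maximum(w1 @ pooled, 0.0)      # (C_r,) ReLU bottleneck
    gate = sigmoid(w2 @ hidden)                # (C,) gates in (0, 1)
    return feat * gate[:, None, None]          # broadcast over H and W


def spatial_attention(feat):
    """Re-weight each spatial location of a (C, H, W) feature map.

    Assumed design: channel-wise max pooling followed by a sigmoid;
    a learned convolution on the pooled map is omitted for brevity.
    """
    pooled = feat.max(axis=0, keepdims=True)   # (1, H, W)
    return feat * sigmoid(pooled)              # broadcast over channels


def depth_enhanced_fusion(rgb_feat, depth_feat, w1, w2):
    """Enhance depth features (channel view, then spatial view) and
    fuse them with RGB features by element-wise addition (assumed)."""
    enhanced = spatial_attention(channel_attention(depth_feat, w1, w2))
    return rgb_feat + enhanced


# Toy inputs: 8 channels on a 4x4 grid, random weights (illustrative only).
rng = np.random.default_rng(0)
C, H, W = 8, 4, 4
rgb = rng.standard_normal((C, H, W))
depth = rng.standard_normal((C, H, W))
w1 = rng.standard_normal((C // 2, C)) * 0.1
w2 = rng.standard_normal((C, C // 2)) * 0.1

fused = depth_enhanced_fusion(rgb, depth, w1, w2)
print(fused.shape)  # (8, 4, 4)
```

Because both gates lie strictly between 0 and 1, the enhanced depth term can only attenuate the raw depth response, never amplify it, so in this sketch depth acts as a bounded correction on top of the RGB features. A trained model would learn the gating weights rather than draw them at random.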
