Paper Title
DPANet: Depth Potentiality-Aware Gated Attention Network for RGB-D Salient Object Detection
Paper Authors
Paper Abstract
There are two main issues in RGB-D salient object detection: (1) how to effectively integrate the complementarity of cross-modal RGB-D data; (2) how to prevent contamination from unreliable depth maps. In fact, these two problems are linked and intertwined, but previous methods tend to focus only on the first problem and ignore the quality of the depth map, which may cause the model to fall into a sub-optimal state. In this paper, we address these two issues synergistically in a holistic model and propose a novel network, named DPANet, to explicitly model the potentiality of the depth map and effectively integrate the cross-modal complementarity. By introducing depth potentiality perception, the network can perceive the potentiality (i.e., reliability) of the depth information in a learning-based manner and guide the fusion of the two modalities to prevent contamination from occurring. The gated multi-modality attention module in the fusion process exploits an attention mechanism with a gate controller to capture long-range dependencies from a cross-modal perspective. Experimental results on 8 datasets, compared against 15 state-of-the-art methods, demonstrate the effectiveness of the proposed approach both quantitatively and qualitatively.
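To make the idea of gated cross-modal attention concrete, below is a minimal PyTorch sketch, not the authors' implementation: it assumes illustrative names (GatedCrossModalAttention, rgb_feat, depth_feat, gate) and a non-local-style attention in which queries come from the RGB branch, keys/values from the depth branch, and a learned scalar gate (e.g., a predicted depth-reliability score) scales the attended depth context before fusion.

```python
# Minimal sketch of a gated cross-modal attention block (illustrative only;
# layer names, shapes, and the gating scheme are assumptions, not the paper's code).
import torch
import torch.nn as nn
import torch.nn.functional as F

class GatedCrossModalAttention(nn.Module):
    def __init__(self, channels: int, reduction: int = 8):
        super().__init__()
        self.query = nn.Conv2d(channels, channels // reduction, 1)  # queries from RGB features
        self.key   = nn.Conv2d(channels, channels // reduction, 1)  # keys from depth features
        self.value = nn.Conv2d(channels, channels, 1)                # values from depth features

    def forward(self, rgb_feat: torch.Tensor, depth_feat: torch.Tensor, gate: torch.Tensor):
        # rgb_feat, depth_feat: (B, C, H, W); gate: (B, 1), a depth-reliability score in [0, 1]
        b, c, h, w = rgb_feat.shape
        q = self.query(rgb_feat).flatten(2).transpose(1, 2)     # (B, HW, C/r)
        k = self.key(depth_feat).flatten(2)                     # (B, C/r, HW)
        v = self.value(depth_feat).flatten(2).transpose(1, 2)   # (B, HW, C)
        attn = F.softmax(q @ k, dim=-1)                          # cross-modal long-range affinities (B, HW, HW)
        out = (attn @ v).transpose(1, 2).reshape(b, c, h, w)     # depth context attended by RGB queries
        # The gate down-weights the depth contribution when the depth map is judged unreliable,
        # so a low-quality depth map cannot contaminate the fused features.
        return rgb_feat + gate.view(b, 1, 1, 1) * out
```

In this sketch the gate is a single scalar per image; the paper's gate controller and depth potentiality perception may operate at a different granularity, but the key design choice is the same: the fusion of depth into RGB is modulated by a learned estimate of how trustworthy the depth map is.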