Paper title
Synergistic saliency and depth prediction for RGB-D saliency detection
Paper authors
Abstract
Depth information available from an RGB-D camera can be useful in segmenting salient objects when figure/ground cues from the RGB channels are weak. This has motivated the development of several RGB-D saliency datasets and algorithms that use all four channels of the RGB-D data for both training and inference. Unfortunately, existing RGB-D saliency datasets are small, which may lead to overfitting and limited generalization to diverse scenarios. Here we propose a semi-supervised system for RGB-D saliency detection that can be trained on smaller RGB-D saliency datasets without saliency ground truth, while also making effective joint use of a large RGB saliency dataset with saliency ground truth. To generalize our method to RGB-D saliency datasets, we employ a novel prediction-guided cross-refinement module, which jointly estimates both saliency and depth through mutual refinement between the two tasks, together with an adversarial learning approach. Critically, our system does not require saliency ground truth for the RGB-D datasets, which saves the substantial human labor of hand labeling, and does not require depth data for inference, allowing the method to be used in the much broader range of applications where only RGB data are available. Evaluation on seven RGB-D datasets demonstrates that, even without saliency ground truth for the RGB-D datasets and using only their RGB data at inference, our semi-supervised system performs favorably on the two largest testing datasets against state-of-the-art fully supervised RGB-D saliency detection methods, which use saliency ground truth for RGB-D datasets at training and depth data at inference. Our approach also achieves comparable results on other popular RGB-D saliency benchmarks.
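The supervision scheme described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that samples from the RGB saliency dataset carry a saliency label, that samples from the RGB-D datasets carry only a depth map used as a training target, and that binary cross-entropy and L1 are reasonable stand-ins for the two loss terms. The names `bce`, `l1`, `training_loss`, and `depth_weight` are hypothetical, and the cross-refinement module and the adversarial term are omitted because the abstract does not specify them.

```python
import math
from typing import Optional, Sequence


def bce(pred: Sequence[float], target: Sequence[float]) -> float:
    """Mean binary cross-entropy over flattened per-pixel probabilities."""
    eps = 1e-7  # clamp to avoid log(0)
    total = 0.0
    for p, t in zip(pred, target):
        p = min(max(p, eps), 1.0 - eps)
        total -= t * math.log(p) + (1.0 - t) * math.log(1.0 - p)
    return total / len(pred)


def l1(pred: Sequence[float], target: Sequence[float]) -> float:
    """Mean absolute error over flattened per-pixel values."""
    return sum(abs(p - t) for p, t in zip(pred, target)) / len(pred)


def training_loss(
    sal_pred: Sequence[float],
    depth_pred: Sequence[float],
    sal_gt: Optional[Sequence[float]] = None,
    depth_gt: Optional[Sequence[float]] = None,
    depth_weight: float = 1.0,  # hypothetical balancing weight
) -> float:
    """Sum only the loss terms whose ground truth exists for this sample."""
    loss = 0.0
    if sal_gt is not None:
        # Assumed case: sample from the RGB saliency dataset (saliency label present).
        loss += bce(sal_pred, sal_gt)
    if depth_gt is not None:
        # Assumed case: sample from an RGB-D dataset (depth present, no saliency label).
        loss += depth_weight * l1(depth_pred, depth_gt)
    return loss


# An RGB-D sample contributes only the depth term; an RGB sample only the saliency term.
rgbd_loss = training_loss([0.9, 0.1], [0.5, 0.5], depth_gt=[1.0, 0.0])
rgb_loss = training_loss([0.5, 0.5], [0.2, 0.8], sal_gt=[1.0, 0.0])
```

Under these assumptions, both prediction heads take only RGB input, so the depth channel would serve solely as a training signal and would not be needed at inference.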