Paper title
Tackling the Problem of Limited Data and Annotations in Semantic Segmentation
Paper authors
Abstract
This work studies semantic segmentation on a small image dataset (simulated by 1000 randomly selected images from PASCAL VOC 2012) where only weak supervision signals (scribbles from user interaction) are available. To tackle the problem of limited data and annotations in image segmentation, two families of techniques are applied to improve segmentation performance: transfer of different pre-trained models and CRF-based methods. To this end, RotNet, DeeperCluster, and Semi&Weakly Supervised Learning (SWSL) pre-trained models are transferred and fine-tuned in a DeepLab-v2 baseline, and dense CRF is applied both as a post-processing step and as a loss regularization technique. The results of my study show that, on this small dataset, using a pre-trained ResNet50 SWSL model gives results that are 7.4% better than using an ImageNet pre-trained model; moreover, when training on the full PASCAL VOC 2012 training data, this pre-training approach increases mIoU by almost 4%. Dense CRF is also shown to be very effective, improving results both as a loss regularization technique in weakly supervised training and as a post-processing tool.
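The abstract reports its results as mIoU (mean intersection over union). As a reminder of what that metric measures, the following is a minimal sketch in plain Python, not the evaluation code used in the study. It assumes flattened label sequences, predictions that already lie in the range of valid class indices, and the common PASCAL VOC convention that label 255 marks pixels to be ignored; the function names `confusion_matrix` and `mean_iou` are illustrative only.

```python
from typing import List, Sequence

IGNORE_LABEL = 255  # assumption: PASCAL VOC "void" label, excluded from scoring


def confusion_matrix(gt: Sequence[int], pred: Sequence[int],
                     num_classes: int,
                     ignore_label: int = IGNORE_LABEL) -> List[List[int]]:
    """Count pixels per (ground-truth class, predicted class) pair."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for g, p in zip(gt, pred):
        if g == ignore_label:
            continue  # unlabeled or void pixels do not contribute
        matrix[g][p] += 1
    return matrix


def mean_iou(matrix: List[List[int]]) -> float:
    """Average per-class IoU = TP / (TP + FP + FN) over classes that occur."""
    n = len(matrix)
    ious = []
    for c in range(n):
        tp = matrix[c][c]
        fn = sum(matrix[c]) - tp                        # class c pixels that were missed
        fp = sum(matrix[r][c] for r in range(n)) - tp   # other pixels predicted as c
        denom = tp + fp + fn
        if denom > 0:  # skip classes absent from both ground truth and prediction
            ious.append(tp / denom)
    return sum(ious) / len(ious) if ious else 0.0


# Toy example with three classes; the fifth pixel is ignored.
gt = [0, 0, 1, 1, 255, 2]
pred = [0, 1, 1, 1, 2, 2]
score = mean_iou(confusion_matrix(gt, pred, num_classes=3))
```

In this toy example the per-class IoUs are 1/2, 2/3 and 1, so `score` is 13/18. In practice the confusion matrix is accumulated over all images of the validation set before the per-class IoUs are averaged.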