论文标题
多晶蒸馏方案,针对轻量级半监督语义分割
Multi-Granularity Distillation Scheme Towards Lightweight Semi-Supervised Semantic Segmentation
论文作者
论文摘要
尽管在半监督语义分割领域的进度程度不同,但其最近的大部分成功都涉及笨拙的模型,并且尚未探索轻量级解决方案。我们发现,现有的知识蒸馏技术会更加关注标签数据中的像素级概念,而该数据未能在未标记的数据中考虑更有用的线索。因此,我们提供了首次尝试通过新颖的多晶蒸馏(MGD)方案提供轻量级SSSS模型,其中从三个方面捕获了多个跨性别:i)互补的教师结构; ii)标记为未标记的数据合作蒸馏; iii)分层和多层次损失设置。具体而言,MGD被配制为标记的未标记数据合作蒸馏方案,该方案有助于充分利用在半监督环境中必不可少的不同数据特征。图像级别对语义敏感的损失,区域级别感知的损失和像素级的一致性损失是通过结构互补的教师来丰富分层蒸馏抽象的。 Pascal VOC2012和CityScapes的实验结果表明,在不同的分区协议下,MGD可以超越竞争方法。例如,在1/16的CityScapes分区协议下,RESNET-18和MOBILENET-V2主链的性能分别提高了11.5%和4.6%。尽管模型骨干的触发器被3.4-5.3倍(RESNET-18)和38.7-59.6X(MobileNetV2)压缩,但该模型设法实现了令人满意的分割结果。
Albeit with varying degrees of progress in the field of Semi-Supervised Semantic Segmentation, most of its recent successes are involved in unwieldy models and the lightweight solution is still not yet explored. We find that existing knowledge distillation techniques pay more attention to pixel-level concepts from labeled data, which fails to take more informative cues within unlabeled data into account. Consequently, we offer the first attempt to provide lightweight SSSS models via a novel multi-granularity distillation (MGD) scheme, where multi-granularity is captured from three aspects: i) complementary teacher structure; ii) labeled-unlabeled data cooperative distillation; iii) hierarchical and multi-levels loss setting. Specifically, MGD is formulated as a labeled-unlabeled data cooperative distillation scheme, which helps to take full advantage of diverse data characteristics that are essential in the semi-supervised setting. Image-level semantic-sensitive loss, region-level content-aware loss, and pixel-level consistency loss are set up to enrich hierarchical distillation abstraction via structurally complementary teachers. Experimental results on PASCAL VOC2012 and Cityscapes reveal that MGD can outperform the competitive approaches by a large margin under diverse partition protocols. For example, the performance of ResNet-18 and MobileNet-v2 backbone is boosted by 11.5% and 4.6% respectively under 1/16 partition protocol on Cityscapes. Although the FLOPs of the model backbone is compressed by 3.4-5.3x (ResNet-18) and 38.7-59.6x (MobileNetv2), the model manages to achieve satisfactory segmentation results.