Multi Receptive Field Network for Semantic Segmentation

Paper Title

Multi Receptive Field Network for Semantic Segmentation

Authors

Jianlong Yuan, Zelu Deng, Shu Wang, Zhenbo Luo

Abstract

Semantic segmentation is one of the key tasks in computer vision: assigning a category label to each pixel in an image. Despite significant recent progress, most existing methods still suffer from two challenging issues: 1) the sizes of objects and stuff in an image can be very diverse, which demands incorporating multi-scale features into fully convolutional networks (FCNs); 2) pixels close to or at the boundaries of objects/stuff are hard to classify due to the intrinsic weakness of convolutional networks. To address the first issue, we propose a new Multi-Receptive Field Module (MRFM) that explicitly takes multi-scale features into account. For the second issue, we design an edge-aware loss that is effective in distinguishing the boundaries of objects/stuff. With these two designs, our Multi Receptive Field Network achieves new state-of-the-art results on two widely used semantic segmentation benchmark datasets. Specifically, we achieve a mean IoU of 83.0 on the Cityscapes dataset and 88.4 on the Pascal VOC2012 dataset.
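The results above are reported as mean IoU (mean intersection-over-union). As a rough illustration of that metric only, the following is a minimal sketch in plain Python. The function name `mean_iou`, the flattened toy label maps, and the choice to skip classes absent from both prediction and ground truth are assumptions made for this example; it does not reproduce the official Cityscapes or Pascal VOC evaluation protocols, which accumulate counts over the whole dataset and handle ignored labels.

```python
from typing import Sequence


def mean_iou(pred: Sequence[int], target: Sequence[int], num_classes: int) -> float:
    """Average per-class IoU over classes that appear in pred or target."""
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same number of pixels")

    intersection = [0] * num_classes  # pixels where both maps agree on class c
    pred_count = [0] * num_classes    # pixels predicted as class c
    target_count = [0] * num_classes  # pixels labeled as class c

    for p, t in zip(pred, target):
        pred_count[p] += 1
        target_count[t] += 1
        if p == t:
            intersection[p] += 1

    ious = []
    for c in range(num_classes):
        union = pred_count[c] + target_count[c] - intersection[c]
        if union > 0:  # assumption: skip classes absent from both maps
            ious.append(intersection[c] / union)

    return sum(ious) / len(ious) if ious else 0.0


# Hypothetical 2x3 label maps, flattened row by row.
pred = [0, 0, 1, 1, 2, 2]
target = [0, 1, 1, 1, 2, 0]
score = mean_iou(pred, target, num_classes=3)
print(f"mean IoU: {score:.3f}")
```

In this toy case the per-class IoUs are 1/3, 2/3 and 1/2. Benchmark figures such as 83.0 are the same kind of average expressed as a percentage.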
