Paper Title

Hierarchical Multi-Scale Attention for Semantic Segmentation

Paper Authors

Andrew Tao, Karan Sapra, Bryan Catanzaro

Paper Abstract

Multi-scale inference is commonly used to improve the results of semantic segmentation. Multiple image scales are passed through a network and then the results are combined with averaging or max pooling. In this work, we present an attention-based approach to combining multi-scale predictions. We show that predictions at certain scales are better at resolving particular failure modes, and that the network learns to favor those scales in such cases in order to generate better predictions. Our attention mechanism is hierarchical, which makes it roughly 4x more memory efficient to train than other recent approaches. In addition to enabling faster training, this allows us to train with larger crop sizes, which leads to greater model accuracy. We demonstrate the results of our method on two datasets: Cityscapes and Mapillary Vistas. For Cityscapes, which has a large number of weakly labelled images, we also leverage auto-labelling to improve generalization. Using our approach, we achieve new state-of-the-art results on both Mapillary (61.1 IOU val) and Cityscapes (85.1 IOU test).
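The abstract describes fusing per-scale predictions with a learned, per-pixel attention map, arranged hierarchically so that only a pair of adjacent scales is needed during training. Below is a minimal sketch (in PyTorch, not the authors' released code) of what such a pairwise fusion could look like; `seg_net`, its `(logits, attention)` return signature, and `fuse_two_scales` are hypothetical names introduced here for illustration.

```python
import torch
import torch.nn.functional as F

def fuse_two_scales(seg_net, image, lo_scale=0.5):
    """Attention-weighted fusion of a low-scale and a full-scale prediction.

    Assumes `seg_net(x)` returns (logits, attention), where `attention` has
    one channel and is already squashed to [0, 1] (e.g. via a sigmoid).
    """
    # Low-resolution pass: downscale the input and run the network.
    lo_img = F.interpolate(image, scale_factor=lo_scale,
                           mode='bilinear', align_corners=False)
    lo_logits, lo_attn = seg_net(lo_img)

    # Upsample the low-scale outputs back to the full input resolution.
    size = image.shape[-2:]
    lo_logits = F.interpolate(lo_logits, size=size,
                              mode='bilinear', align_corners=False)
    lo_attn = F.interpolate(lo_attn, size=size,
                            mode='bilinear', align_corners=False)

    # Full-resolution pass.
    hi_logits, _ = seg_net(image)

    # Per-pixel combination: the attention map decides how much to trust the
    # low-scale prediction versus the high-scale one at each location.
    return lo_attn * lo_logits + (1.0 - lo_attn) * hi_logits
```

At inference time the same pairwise rule can be applied repeatedly, starting from the lowest scale and folding in progressively finer scales, which is the hierarchical chaining the abstract refers to.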
