PP-LiteSeg: A Superior Real-Time Semantic Segmentation Model

Paper Title

PP-LiteSeg: A Superior Real-Time Semantic Segmentation Model

Authors

Juncai Peng, Yi Liu, Shiyu Tang, Yuying Hao, Lutao Chu, Guowei Chen, Zewu Wu, Zeyu Chen, Zhiliang Yu, Yuning Du, Qingqing Dang, Baohua Lai, Qiwen Liu, Xiaoguang Hu, Dianhai Yu, Yanjun Ma

Abstract

Real-world applications have high demands for semantic segmentation methods. Although semantic segmentation has made remarkable leap-forwards with deep learning, the performance of real-time methods is not satisfactory. In this work, we propose PP-LiteSeg, a novel lightweight model for the real-time semantic segmentation task. Specifically, we present a Flexible and Lightweight Decoder (FLD) to reduce computation overhead of previous decoder. To strengthen feature representations, we propose a Unified Attention Fusion Module (UAFM), which takes advantage of spatial and channel attention to produce a weight and then fuses the input features with the weight. Moreover, a Simple Pyramid Pooling Module (SPPM) is proposed to aggregate global context with low computation cost. Extensive evaluations demonstrate that PP-LiteSeg achieves a superior trade-off between accuracy and speed compared to other methods. On the Cityscapes test set, PP-LiteSeg achieves 72.0% mIoU/273.6 FPS and 77.5% mIoU/102.6 FPS on NVIDIA GTX 1080Ti. Source code and models are available at PaddleSeg: https://github.com/PaddlePaddle/PaddleSeg.
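The abstract describes the UAFM only in outline: an attention step produces a weight, and the two input features are fused with that weight. The NumPy sketch below illustrates one way such a fusion could look. It is not the PaddleSeg implementation. The function name `attention_fuse`, the use of channel-wise mean and max statistics, the fixed mixing vector `w` (standing in for a learned convolution), and the convex combination `alpha * f_high + (1 - alpha) * f_low` are all assumptions made for illustration.

```python
import numpy as np


def sigmoid(x):
    """Logistic function, mapping any real value into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-x))


def attention_fuse(f_high, f_low, w=None):
    """Fuse two feature maps of shape (C, H, W) with a spatial attention weight.

    Hypothetical sketch: the weight is the sigmoid of a linear mix of four
    per-pixel statistics (channel-wise mean and max of each input). A real
    module would learn this mix, e.g. with a small convolution.
    """
    if f_high.shape != f_low.shape:
        raise ValueError("both feature maps must have the same shape")

    # Per-pixel statistics over the channel axis, stacked to shape (4, H, W).
    stats = np.stack([
        f_high.mean(axis=0), f_high.max(axis=0),
        f_low.mean(axis=0), f_low.max(axis=0),
    ])

    if w is None:
        # Assumed fixed mixing weights; a placeholder for learned parameters.
        w = np.array([0.5, 0.5, -0.5, -0.5])

    # One weight per pixel, shape (H, W), strictly between 0 and 1.
    alpha = sigmoid(np.tensordot(w, stats, axes=1))

    # Weighted fusion; alpha broadcasts across the channel axis.
    fused = alpha * f_high + (1.0 - alpha) * f_low
    return fused, alpha


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    high = rng.normal(size=(3, 4, 5))
    low = rng.normal(size=(3, 4, 5))
    fused, alpha = attention_fuse(high, low)
    print(fused.shape, alpha.shape)
```

Because the weight lies strictly between 0 and 1, every fused value falls between the corresponding values of the two inputs, so neither feature map can be amplified beyond its original range by the fusion step itself.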

