Paper title
PP-LiteSeg: A Superior Real-Time Semantic Segmentation Model
Authors
Abstract
Real-world applications place high demands on semantic segmentation methods. Although deep learning has brought remarkable advances in semantic segmentation, the performance of real-time methods remains unsatisfactory. In this work, we propose PP-LiteSeg, a novel lightweight model for the real-time semantic segmentation task. Specifically, we present a Flexible and Lightweight Decoder (FLD) to reduce the computation overhead of previous decoders. To strengthen feature representations, we propose a Unified Attention Fusion Module (UAFM), which uses spatial and channel attention to produce a weight and then fuses the input features with that weight. Moreover, a Simple Pyramid Pooling Module (SPPM) is proposed to aggregate global context at low computation cost. Extensive evaluations demonstrate that PP-LiteSeg achieves a superior trade-off between accuracy and speed compared to other methods. On the Cityscapes test set, PP-LiteSeg achieves 72.0% mIoU at 273.6 FPS and 77.5% mIoU at 102.6 FPS on an NVIDIA GTX 1080Ti. Source code and models are available at PaddleSeg: https://github.com/PaddlePaddle/PaddleSeg.
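The abstract says only that UAFM produces a weight from attention and fuses the input features with it. The sketch below illustrates one plausible reading of that idea in plain Python, assuming the fusion is a per-pixel convex combination of a high-level and a low-level feature map. The function names (`spatial_attention_weight`, `fuse`) are hypothetical, and the learned convolution of a real attention module is replaced by a simple average of channel statistics, so this is not the PaddleSeg implementation.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def spatial_attention_weight(high, low):
    """Return an H x W weight map in (0, 1) from two C x H x W feature maps.

    Assumption: a real module would learn this mapping with convolutions;
    here the four channel-wise statistics are simply averaged.
    """
    height, width = len(high[0]), len(high[0][0])
    weight = []
    for i in range(height):
        row = []
        for j in range(width):
            hi_vals = [ch[i][j] for ch in high]
            lo_vals = [ch[i][j] for ch in low]
            # Mean and max over channels, for each of the two inputs.
            stats = [
                sum(hi_vals) / len(hi_vals), max(hi_vals),
                sum(lo_vals) / len(lo_vals), max(lo_vals),
            ]
            row.append(sigmoid(sum(stats) / len(stats)))
        weight.append(row)
    return weight


def fuse(high, low, weight):
    """Blend per pixel in every channel: out = w * high + (1 - w) * low."""
    return [
        [
            [w * h + (1.0 - w) * l for h, l, w in zip(h_row, l_row, w_row)]
            for h_row, l_row, w_row in zip(h_ch, l_ch, weight)
        ]
        for h_ch, l_ch in zip(high, low)
    ]


# Two channels, spatial size 1 x 2 (toy values chosen for illustration).
high = [[[1.0, 2.0]], [[3.0, 4.0]]]
low = [[[0.0, 0.0]], [[0.0, 0.0]]]
weight = spatial_attention_weight(high, low)
fused = fuse(high, low, weight)
```

Because the weight lies strictly between 0 and 1, each fused value falls between the corresponding low-level and high-level values, which is the sense in which the weight arbitrates between the two inputs.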