U-HRNet: Delving into Improving Semantic Representation of High Resolution Network for Dense Prediction

Paper Title

U-HRNet: Delving into Improving Semantic Representation of High Resolution Network for Dense Prediction

Authors

Jian Wang, Xiang Long, Guowei Chen, Zewu Wu, Zeyu Chen, Errui Ding

Abstract

High resolution and advanced semantic representation are both vital for dense prediction. Empirically, low-resolution feature maps often achieve stronger semantic representation, while high-resolution feature maps are generally better at identifying local features such as edges but contain weaker semantic information. Existing state-of-the-art frameworks such as HRNet keep low-resolution and high-resolution feature maps in parallel and repeatedly exchange information across the different resolutions. However, we believe that the lowest-resolution feature map often contains the strongest semantic information and needs to pass through more layers before it is merged with the high-resolution feature maps, whereas for high-resolution feature maps the computational cost of each convolutional layer is very large and there is no need to pass through so many layers. Therefore, we design a U-shaped High-Resolution Network (U-HRNet), which adds more stages after the feature map with the strongest semantic representation and relaxes the constraint in HRNet that all resolutions must be computed in parallel for a newly added stage. More computation is allocated to low-resolution feature maps, which significantly improves the overall semantic representation. U-HRNet is a substitute for the HRNet backbone and achieves significant improvements on multiple semantic segmentation and depth prediction datasets under exactly the same training and inference settings, with almost no increase in the amount of computation. Code is available at PaddleSeg: https://github.com/PaddlePaddle/PaddleSeg.
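The cost argument in the abstract can be illustrated with a rough count of multiply-accumulate operations (MACs) for a single convolution at several resolutions. This is only a sketch and is not taken from the paper or from PaddleSeg: the function name `conv_macs`, the base feature-map size, and the channel width are all hypothetical, and the channel count is held fixed across branches so that resolution is the only variable. Real multi-resolution networks typically widen the channels on low-resolution branches, which offsets part of the gap shown here.

```python
def conv_macs(height, width, in_channels, out_channels, kernel_size=3):
    """MACs for one stride-1, same-padded 2D convolution (bias ignored)."""
    return height * width * in_channels * out_channels * kernel_size * kernel_size


# Hypothetical sizes: four branches, each at half the resolution of the previous one.
base_height, base_width, channels = 128, 256, 64

branch_macs = [
    conv_macs(base_height >> i, base_width >> i, channels, channels)
    for i in range(4)
]

# How many times more expensive the highest-resolution branch is than each branch.
ratios = [branch_macs[0] // macs for macs in branch_macs]

for i, (macs, ratio) in enumerate(zip(branch_macs, ratios)):
    print(f"branch {i}: {macs} MACs per conv, highest-res branch costs {ratio}x this")
```

Under these assumptions, halving the resolution cuts the per-layer cost by a factor of four, so extra layers placed on the lowest-resolution branch are far cheaper than the same number of layers on the highest-resolution branch.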
