Paper Title
Online Convolutional Re-parameterization
Paper Authors
Abstract
Structural re-parameterization has drawn increasing attention in various computer vision tasks. It aims to improve the performance of deep models without introducing any inference-time cost. Though efficient during inference, such models rely heavily on complicated training-time blocks to achieve high accuracy, leading to large extra training cost. In this paper, we present Online Convolutional Re-parameterization (OREPA), a two-stage pipeline that aims to reduce the huge training overhead by squeezing the complex training-time block into a single convolution. To achieve this goal, we introduce a linear scaling layer for better optimization of the online blocks. Aided by the reduced training cost, we also explore some more effective re-param components. Compared with state-of-the-art re-param models, OREPA reduces training-time memory cost by about 70% and accelerates training by around 2x. Meanwhile, equipped with OREPA, the models outperform previous methods on ImageNet by up to +0.6%. We also conduct experiments on object detection and semantic segmentation and show consistent improvements on these downstream tasks. Code is available at https://github.com/JUGGHM/OREPA_CVPR2022 .
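The key property that makes re-parameterization possible is the linearity of convolution: parallel conv branches (including one passed through a linear scaling layer) can be folded into a single kernel before inference. The following is a minimal NumPy sketch of this idea; the kernel sizes, the scaling factor, and the naive convolution routine are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

def conv2d(x, k):
    """Naive 'valid' 2D cross-correlation (the conv used in deep learning)."""
    H, W = x.shape
    kh, kw = k.shape
    out = np.zeros((H - kh + 1, W - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(x[i:i + kh, j:j + kw] * k)
    return out

rng = np.random.default_rng(0)
x = rng.standard_normal((8, 8))     # toy input feature map
k1 = rng.standard_normal((3, 3))    # branch 1 kernel
k2 = rng.standard_normal((3, 3))    # branch 2 kernel
s = 0.5                             # stand-in for a linear scaling layer

# Training-time block: two parallel conv branches, one linearly scaled.
y_block = conv2d(x, k1) + s * conv2d(x, k2)

# Re-parameterized: fold both branches into one kernel, run one conv.
k_merged = k1 + s * k2
y_merged = conv2d(x, k_merged)

assert np.allclose(y_block, y_merged)  # identical outputs, single conv
```

Because the merge is exact, the single-conv form pays no accuracy penalty; OREPA's contribution is performing this squeeze online during training rather than only at deployment, which is where the memory and speed savings come from.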