Paper Title
Online Convolutional Re-parameterization
Paper Authors
Abstract
Structural re-parameterization has drawn increasing attention in various computer vision tasks. It aims to improve the performance of deep models without introducing any inference-time cost. Though efficient during inference, such models rely heavily on complicated training-time blocks to achieve high accuracy, leading to large extra training cost. In this paper, we present Online Convolutional Re-parameterization (OREPA), a two-stage pipeline that aims to reduce the huge training overhead by squeezing the complex training-time block into a single convolution. To achieve this goal, we introduce a linear scaling layer for better optimization of the online blocks. Aided by the reduced training cost, we also explore some more effective re-param components. Compared with state-of-the-art re-param models, OREPA reduces training-time memory cost by about 70% and accelerates training by around 2x. Meanwhile, equipped with OREPA, the models outperform previous methods on ImageNet by up to +0.6%. We also conduct experiments on object detection and semantic segmentation and show consistent improvements on these downstream tasks. Code is available at https://github.com/JUGGHM/OREPA_CVPR2022 .
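The key property that makes re-parameterization possible is the linearity of convolution: parallel conv branches (including one passed through a linear scaling layer) can be folded into a single kernel before inference. The following is a minimal NumPy sketch of this idea; the kernel sizes, the scaling factor, and the naive convolution routine are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

def conv2d(x, k):
    """Naive 'valid' 2D cross-correlation (the conv used in deep learning)."""
    H, W = x.shape
    kh, kw = k.shape
    out = np.zeros((H - kh + 1, W - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(x[i:i + kh, j:j + kw] * k)
    return out

rng = np.random.default_rng(0)
x = rng.standard_normal((8, 8))     # toy input feature map
k1 = rng.standard_normal((3, 3))    # branch 1 kernel
k2 = rng.standard_normal((3, 3))    # branch 2 kernel
s = 0.5                             # stand-in for a linear scaling layer

# Training-time block: two parallel conv branches, one linearly scaled.
y_block = conv2d(x, k1) + s * conv2d(x, k2)

# Re-parameterized: fold both branches into one kernel, run one conv.
k_merged = k1 + s * k2
y_merged = conv2d(x, k_merged)

assert np.allclose(y_block, y_merged)  # identical outputs, single conv
```

Because the merge is exact, the single-conv form pays no accuracy penalty; OREPA's contribution is performing this squeeze online during training rather than only at deployment, which is where the memory and speed savings come from.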