Paper title
QuantNAS for super resolution: searching for efficient quantization-friendly architectures against quantization noise
Paper authors
Paper abstract
There is a constant need for high-performing and computationally efficient neural network models for image super-resolution: computationally efficient models can run on low-capacity devices and reduce the carbon footprint. One way to obtain such models is model compression, e.g. quantization. Another is neural architecture search (NAS), which automatically discovers new, more efficient solutions. We propose QuantNAS, a novel quantization-aware procedure that combines the advantages of these two approaches: it searches directly for quantization-friendly super-resolution models. The approach uses entropy regularization, quantization noise, and an Adaptive Deviation for Quantization (ADQ) module to enhance the search procedure. The entropy regularization technique prioritizes a single operation within each block of the search space. Adding quantization noise to parameters and activations approximates the model degradation caused by quantization, resulting in more quantization-friendly architectures. The ADQ module helps to alleviate problems caused by Batch Norm blocks in super-resolution models. Our experimental results show that the proposed approximations serve the search procedure better than direct model quantization. QuantNAS discovers architectures with a better PSNR/BitOps trade-off than uniform or mixed-precision quantization of fixed architectures. We showcase the effectiveness of our method by applying it to two search spaces inspired by state-of-the-art SR models and RFDN. Thus, anyone can design a suitable search space based on an existing architecture and apply our method to obtain better quality and efficiency. The proposed procedure is 30% faster than direct weight quantization and is more stable.
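To make the quantization-noise idea concrete, below is a minimal PyTorch sketch in which only a random subset of convolution weights is quantized during training, approximating post-quantization degradation while keeping exact gradients for the rest. This is an illustrative sketch, not the authors' implementation; the layer name, the uniform min-max quantizer, and the noise_ratio parameter are assumptions.

```python
import torch
import torch.nn.functional as F
from torch import nn

def fake_quantize(x, num_bits=8):
    # Uniform min-max quantization with a straight-through estimator:
    # rounding in the forward pass, identity gradient in the backward pass.
    qmax = 2 ** num_bits - 1
    lo, hi = x.detach().min(), x.detach().max()
    scale = (hi - lo).clamp(min=1e-8) / qmax
    q = torch.round((x - lo) / scale).clamp(0, qmax) * scale + lo
    return x + (q - x).detach()  # straight-through trick

class NoisyQuantConv2d(nn.Conv2d):
    # Illustrative sketch: during training, a random fraction of the weights
    # is replaced by its quantized values; at evaluation time all weights
    # are quantized.
    def __init__(self, *args, noise_ratio=0.5, num_bits=8, **kwargs):
        super().__init__(*args, **kwargs)
        self.noise_ratio = noise_ratio
        self.num_bits = num_bits

    def forward(self, x):
        w_q = fake_quantize(self.weight, self.num_bits)
        if self.training:
            mask = (torch.rand_like(self.weight) < self.noise_ratio).float()
            w = mask * w_q + (1.0 - mask) * self.weight
        else:
            w = w_q
        return F.conv2d(x, w, self.bias, self.stride,
                        self.padding, self.dilation, self.groups)
```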
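Similarly, the entropy regularization that pushes each search block toward a single operation can be sketched as a penalty on the softmax over per-block architecture logits (DARTS-style); arch_logits and strength are assumed names, and the exact form used in the paper may differ.

```python
import torch

def entropy_regularizer(arch_logits, strength=1e-2):
    # arch_logits: iterable of tensors, one per search block, holding the
    # unnormalized scores of the candidate operations in that block.
    # Penalizing the entropy of the operation distribution drives each
    # block toward a single dominant (near one-hot) choice.
    penalty = 0.0
    for logits in arch_logits:
        p = torch.softmax(logits, dim=-1)
        penalty = penalty - (p * torch.log(p + 1e-12)).sum()
    return strength * penalty

# Hypothetical usage during search, where task_loss is the SR training loss:
# loss = task_loss + entropy_regularizer(arch_logits)
```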