Paper Title
Understanding and Improving Group Normalization
Paper Authors
Paper Abstract
Various normalization layers have been proposed to help the training of neural networks. Group Normalization (GN) is one of the effective and attractive approaches, achieving strong performance on visual recognition tasks. Despite this success, GN still has several issues that may negatively impact neural network training. In this paper, we introduce an analysis framework and discuss the working principles of GN in affecting the training process of the neural network. From experimental results, we identify the real causes of GN's inferior performance relative to Batch Normalization (BN): 1) \textbf{unstable training performance}, and 2) \textbf{higher sensitivity} to distortion, whether it comes from external noise or from perturbations introduced by regularization. In addition, we find that GN helps neural network training only during a specific period, unlike BN, which helps the network throughout training. To solve these issues, we propose a new normalization layer built on top of GN by incorporating the advantages of BN. Experimental results on image classification tasks demonstrate that the proposed normalization layer outperforms the official GN, improving recognition accuracy regardless of batch size and stabilizing network training.
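For reference, GN computes per-sample statistics over groups of channels rather than over the batch dimension, which is why its behavior is independent of batch size. The following is a minimal NumPy sketch of the standard GN operation discussed in the abstract (the baseline, not the paper's proposed layer); the function name and signature are illustrative, not from the paper.

```python
import numpy as np

def group_norm(x, num_groups, gamma, beta, eps=1e-5):
    """Group Normalization over an NCHW tensor (illustrative sketch).

    x:     (N, C, H, W) input
    gamma: (C,) per-channel scale
    beta:  (C,) per-channel shift
    """
    n, c, h, w = x.shape
    assert c % num_groups == 0, "channels must be divisible by groups"
    # Reshape so each group's channels are normalized together,
    # per sample and independently of the batch (unlike BN).
    xg = x.reshape(n, num_groups, c // num_groups, h, w)
    mean = xg.mean(axis=(2, 3, 4), keepdims=True)
    var = xg.var(axis=(2, 3, 4), keepdims=True)
    xg = (xg - mean) / np.sqrt(var + eps)
    x_norm = xg.reshape(n, c, h, w)
    # Per-channel affine transform, as in both BN and GN.
    return x_norm * gamma.reshape(1, c, 1, 1) + beta.reshape(1, c, 1, 1)

# Example: normalize a random batch of 2 samples with 4 groups.
x = np.random.randn(2, 8, 4, 4).astype(np.float32)
out = group_norm(x, num_groups=4, gamma=np.ones(8), beta=np.zeros(8))
```

Because the statistics are computed per sample, the output is identical for any batch size, including a batch of one; BN, by contrast, averages over the batch axis and degrades when batches are small.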