Are All Linear Regions Created Equal?

Paper title

Are All Linear Regions Created Equal?

Authors

Matteo Gamba, Adrian Chmielewski-Anders, Josephine Sullivan, Hossein Azizpour, Mårten Björkman

Abstract

The number of linear regions has been studied as a proxy for the complexity of ReLU networks. However, the empirical success of network compression techniques such as pruning and knowledge distillation suggests that, in the overparameterized setting, linear region density might fail to capture the effective nonlinearity. In this work, we propose an efficient algorithm for discovering linear regions and use it to investigate how well density captures the nonlinearity of trained VGGs and ResNets on CIFAR-10 and CIFAR-100. We contrast the results with a more principled nonlinearity measure based on function variation, highlighting the shortcomings of linear region density. Interestingly, our measure of nonlinearity clearly correlates with model-wise deep double descent, connecting reduced test error with reduced nonlinearity and increased local similarity of linear regions.
