Paper Title
Local Lipschitz Bounds of Deep Neural Networks
Paper Authors
Paper Abstract
The Lipschitz constant is an important quantity that arises in analysing the convergence of gradient-based optimization methods. It is generally unclear how to estimate the Lipschitz constant of a complex model, so this paper studies an important problem that may be useful to the broader area of non-convex optimization. The main result provides local upper bounds on the Lipschitz constants of a multi-layer feed-forward neural network and of its gradient. Moreover, lower bounds are established and used to show that it is impossible to derive global upper bounds for these Lipschitz constants. In contrast to previous works, we compute the Lipschitz constants with respect to the network parameters rather than with respect to the inputs. These constants are needed for the theoretical description of many step-size schedulers in gradient-based optimization schemes and for their convergence analysis. The idea is both simple and effective. The results extend to a generalization of neural networks, continuously deep neural networks, which are described by controlled ODEs.
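To make the central notion concrete, here is a minimal, self-contained sketch (our own illustration under assumed notation, not code or results from the paper): for a network f(theta, x), a local Lipschitz bound with respect to the parameters is a constant L(R) such that |f(theta1, x) - f(theta2, x)| <= L(R) * ||theta1 - theta2|| whenever both parameter vectors lie in a ball of radius R. The toy two-layer network, the radius, and the Monte-Carlo probing below are illustrative assumptions; the sampled estimate only lower-bounds the true constant, and its growth with R is consistent with the abstract's claim that no global upper bound exists.

```python
# Illustrative sketch only: empirically probe the local Lipschitz constant of a
# small feed-forward network with respect to its parameters theta (not the paper's
# method; network shape and radius are arbitrary choices for demonstration).
import numpy as np

rng = np.random.default_rng(0)

def forward(theta, x):
    """Two-layer tanh network; theta packs (W1, b1, W2, b2) for a 2-4-1 architecture."""
    W1 = theta[:8].reshape(4, 2)
    b1 = theta[8:12]
    W2 = theta[12:16].reshape(1, 4)
    b2 = theta[16:17]
    h = np.tanh(W1 @ x + b1)
    return W2 @ h + b2

def local_lipschitz_estimate(x, radius=1.0, n_samples=2000, dim=17):
    """Monte-Carlo lower estimate of the parameter-Lipschitz constant on a ball of given radius."""
    best = 0.0
    for _ in range(n_samples):
        t1 = rng.uniform(-radius, radius, dim)
        t2 = rng.uniform(-radius, radius, dim)
        num = np.linalg.norm(forward(t1, x) - forward(t2, x))
        den = np.linalg.norm(t1 - t2)
        if den > 1e-12:
            best = max(best, num / den)
    return best

x = np.array([0.5, -1.0])
for R in (0.5, 1.0, 2.0):
    # The estimate grows with the radius R, illustrating why the bound is local.
    print(R, local_lipschitz_estimate(x, radius=R))
```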