Paper Title
Lifted Bregman Training of Neural Networks
Paper Authors
Paper Abstract
We introduce a novel mathematical formulation for the training of feed-forward neural networks with (potentially non-smooth) proximal maps as activation functions. This formulation is based on Bregman distances, and a key advantage is that its partial derivatives with respect to the network's parameters do not require the computation of derivatives of the network's activation functions. Instead of estimating the parameters with a combination of a first-order optimisation method and back-propagation (as is the state of the art), we propose the use of non-smooth first-order optimisation methods that exploit the specific structure of the novel formulation. We present several numerical results which demonstrate that, compared to more conventional training frameworks, these training approaches are equally well or even better suited for training neural-network-based classifiers and (denoising) autoencoders with sparse coding.
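To make the key property concrete, here is a minimal sketch consistent with the abstract's description (our notation, not an excerpt from the paper). Suppose an activation function is a proximal map, \(\sigma = \mathrm{prox}_\Psi\) for some proper, convex, lower semi-continuous \(\Psi\), and let \(x\) denote a layer's output and \(z\) its pre-activation. A Fenchel-Young-type Bregman penalty is

\[ B_\Psi(x, z) \;=\; \Phi(x) + \Phi^*(z) - \langle x, z \rangle, \qquad \Phi(x) := \tfrac{1}{2}\|x\|^2 + \Psi(x), \]

which is non-negative and vanishes exactly when \(x = \mathrm{prox}_\Psi(z) = \sigma(z)\). Since \(\Phi\) is 1-strongly convex, \(\Phi^*\) is differentiable with \(\nabla \Phi^*(z) = \mathrm{prox}_\Psi(z)\), so

\[ \nabla_z B_\Psi(x, z) \;=\; \mathrm{prox}_\Psi(z) - x. \]

By the chain rule, partial derivatives of this penalty with respect to the layer's weights and biases therefore require only evaluations of the proximal map itself, never its derivative. For instance, taking \(\Psi\) to be the indicator function of the non-negative orthant makes \(\mathrm{prox}_\Psi\) the ReLU, and \(\nabla_z B_\Psi\) remains well defined even though the ReLU is not differentiable at zero.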