Paper Title

A note on Linear Bottleneck networks and their Transition to Multilinearity

Paper Authors

Libin Zhu, Parthe Pandit, Mikhail Belkin

Paper Abstract

Randomly initialized wide neural networks transition to linear functions of weights as the width grows, in a ball of radius $O(1)$ around initialization. A necessary condition for this result is that all layers of the network are wide enough, i.e., all widths tend to infinity. However, the transition to linearity breaks down when this infinite width assumption is violated. In this work we show that linear networks with a bottleneck layer learn bilinear functions of the weights, in a ball of radius $O(1)$ around initialization. In general, for $B-1$ bottleneck layers, the network is a degree $B$ multilinear function of weights. Importantly, the degree only depends on the number of bottlenecks and not the total depth of the network.
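The bilinear case (one bottleneck, $B = 2$) can be illustrated directly: a two-layer linear network $f(x; W_1, W_2) = W_2 W_1 x$ is exactly linear in each weight matrix when the other is held fixed, and jointly scaling both factors scales the output quadratically. A minimal numerical sketch (the dimensions and the toy two-layer setup are illustrative assumptions, not the paper's construction):

```python
import numpy as np

rng = np.random.default_rng(0)

d, k, out = 5, 3, 2  # input dim, bottleneck width, output dim (hypothetical sizes)
x = rng.standard_normal(d)

def f(W1, W2):
    """Two-layer linear network: f(x; W1, W2) = W2 @ W1 @ x."""
    return W2 @ (W1 @ x)

W1 = rng.standard_normal((k, d))
W1p = rng.standard_normal((k, d))
W2 = rng.standard_normal((out, k))

# Linearity in the first weight group, with W2 held fixed:
assert np.allclose(f(W1 + W1p, W2), f(W1, W2) + f(W1p, W2))

# Scaling each factor scales the output linearly in that factor,
# so scaling both scales it quadratically (degree B = 2 for one bottleneck):
assert np.allclose(f(2 * W1, 3 * W2), 6 * f(W1, W2))
```

With $B-1$ bottlenecks the same pattern extends to a degree-$B$ multilinear map: one degree per bottleneck-separated weight group, independent of how many wide layers sit between them.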
