Paper Title

Weight Expansion: A New Perspective on Dropout and Generalization

Paper Authors

Gaojie Jin, Xinping Yi, Pengfei Yang, Lijun Zhang, Sven Schewe, Xiaowei Huang

Paper Abstract

While dropout is known to be a successful regularization technique, insights into the mechanisms that lead to this success are still lacking. We introduce the concept of \emph{weight expansion}, an increase in the signed volume of a parallelotope spanned by the column or row vectors of the weight covariance matrix, and show that weight expansion is an effective means of increasing the generalization in a PAC-Bayesian setting. We provide a theoretical argument that dropout leads to weight expansion and extensive empirical support for the correlation between dropout and weight expansion. To support our hypothesis that weight expansion can be regarded as an \emph{indicator} of the enhanced generalization capability endowed by dropout, and not just as a mere by-product, we have studied other methods that achieve weight expansion (resp.\ contraction), and found that they generally lead to an increased (resp.\ decreased) generalization ability. This suggests that dropout is an attractive regularizer, because it is a computationally cheap method for obtaining weight expansion. This insight justifies the role of dropout as a regularizer, while paving the way for identifying regularizers that promise improved generalization through weight expansion.
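
The abstract's central quantity, the signed volume of the parallelotope spanned by the column (or row) vectors of the weight covariance matrix, is the determinant of that matrix. Below is a minimal Python sketch of how one might estimate it; the function name weight_volume, the sampling setup, and the toy data are illustrative assumptions, and the paper's exact normalization of the covariance may differ.

import numpy as np

def weight_volume(weight_samples):
    """Estimate the (log) signed volume of the parallelotope spanned by
    the column vectors of the empirical weight covariance matrix.

    weight_samples: array of shape (n_samples, n_weights), e.g. a layer's
    flattened weights collected over independent training runs or steps.
    """
    # Empirical covariance across samples: shape (n_weights, n_weights).
    cov = np.cov(weight_samples, rowvar=False)
    # The signed volume of the parallelotope spanned by a matrix's columns
    # is its determinant; slogdet avoids under/overflow for large matrices.
    sign, logdet = np.linalg.slogdet(cov)
    return sign, logdet

# Illustrative comparison (toy data, not the paper's experiments):
# a broader weight distribution yields a larger weight volume.
rng = np.random.default_rng(0)
narrow = rng.normal(scale=1.0, size=(200, 10))
broad = rng.normal(scale=1.5, size=(200, 10))
print(weight_volume(narrow))  # smaller log-determinant
print(weight_volume(broad))   # larger log-determinant

In this reading, "weight expansion" corresponds to an increase in the log-determinant above, and "weight contraction" to a decrease.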
