Paper Title

In Search of Robust Measures of Generalization

Authors

Gintare Karolina Dziugaite, Alexandre Drouin, Brady Neal, Nitarshan Rajkumar, Ethan Caballero, Linbo Wang, Ioannis Mitliagkas, Daniel M. Roy

Abstract

One of the principal scientific challenges in deep learning is explaining generalization, i.e., why the particular way the community now trains networks to achieve small training error also leads to small error on held-out data from the same population. It is widely appreciated that some worst-case theories -- such as those based on the VC dimension of the class of predictors induced by modern neural network architectures -- are unable to explain empirical performance. A large volume of work aims to close this gap, primarily by developing bounds on generalization error, optimization error, and excess risk. When evaluated empirically, however, most of these bounds are numerically vacuous. Focusing on generalization bounds, this work addresses the question of how to evaluate such bounds empirically. Jiang et al. (2020) recently described a large-scale empirical study aimed at uncovering potential causal relationships between bounds/measures and generalization. Building on their study, we highlight where their proposed methods can obscure failures and successes of generalization measures in explaining generalization. We argue that generalization measures should instead be evaluated within the framework of distributional robustness.
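To make the final claim concrete: evaluating a measure "within the framework of distributional robustness" amounts to judging it by its worst-case behavior across environments (e.g., different hyperparameter distributions), rather than by its average behavior, since averaging can hide environments where the measure fails. The sketch below is an illustrative toy only, not the authors' actual protocol: the function names (`spearman`, `robust_score`) and the use of Spearman rank correlation as the per-environment score are assumptions for the example.

```python
# Toy sketch: score a generalization measure by its *worst-case*
# rank correlation with the generalization gap across environments,
# instead of the average correlation, which can mask failures.

from statistics import mean

def _ranks(values):
    # Assign ranks 0..n-1 by sorted order (ties broken by position).
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0] * len(values)
    for pos, i in enumerate(order):
        ranks[i] = pos
    return ranks

def spearman(xs, ys):
    # Spearman correlation = Pearson correlation of the ranks.
    rx, ry = _ranks(xs), _ranks(ys)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    return cov / (vx * vy) ** 0.5

def robust_score(environments):
    # environments: list of (measure_values, generalization_gaps),
    # one pair per environment (e.g., per hyperparameter setting).
    # Return (worst-case score, average score): a robust evaluation
    # reports the minimum, since the mean can hide a failing regime.
    scores = [spearman(m, g) for m, g in environments]
    return min(scores), mean(scores)
```

For example, a measure that tracks the gap perfectly in one environment but inverts in another averages to zero correlation, while the robust (minimum) score of -1.0 exposes the failure outright.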
