Paper Title

Towards Data-Algorithm Dependent Generalization: a Case Study on Overparameterized Linear Regression

Authors

Jing Xu, Jiaye Teng, Yang Yuan, Andrew Chi-Chih Yao

Abstract

One of the major open problems in machine learning is to characterize generalization in the overparameterized regime, where most traditional generalization bounds become inconsistent even for overparameterized linear regression. In many scenarios, this failure can be attributed to obscuring the crucial interplay between the training algorithm and the underlying data distribution. This paper demonstrates that the generalization behavior of overparameterized models should be analyzed in a manner that is both data-relevant and algorithm-relevant. To make a formal characterization, we introduce a notion called data-algorithm compatibility, which considers the generalization behavior of the entire data-dependent training trajectory instead of the traditional last-iterate analysis. We validate our claim by studying the setting of solving overparameterized linear regression with gradient descent. Specifically, we perform a data-dependent trajectory analysis and derive a sufficient condition for compatibility in such a setting. Our theoretical results demonstrate that if we take early-stopping iterates into consideration, generalization can hold with significantly weaker restrictions on the problem instance than in previous last-iterate analyses.
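To make the trajectory-versus-last-iterate distinction concrete, here is a minimal sketch, not the paper's experiment or analysis. It runs gradient descent from zero on a synthetic overparameterized least-squares problem and records the excess risk of every iterate. The function name `simulate_gd_trajectory`, the diagonal covariance with eigenvalues 1/i^2, the sparse ground truth, the noise level, and the step size are all illustrative assumptions.

```python
import random


def simulate_gd_trajectory(n=20, d=100, noise_std=0.5, lr=0.1, steps=300, seed=0):
    """Run gradient descent on overparameterized (d > n) linear regression.

    Returns the excess risk of each iterate w_0, ..., w_steps.
    Every modelling choice below is an assumption made for illustration.
    """
    rng = random.Random(seed)

    # Assumed data distribution: zero-mean Gaussian features with a diagonal
    # covariance whose eigenvalues decay as 1 / i^2.
    eigs = [1.0 / (i + 1) ** 2 for i in range(d)]

    # Assumed ground truth: signal only in the first five coordinates.
    w_star = [1.0 if j < 5 else 0.0 for j in range(d)]

    X = [[rng.gauss(0.0, eigs[j] ** 0.5) for j in range(d)] for _ in range(n)]
    y = [
        sum(x[j] * w_star[j] for j in range(d)) + rng.gauss(0.0, noise_std)
        for x in X
    ]

    def excess_risk(w):
        # For this model, excess risk = (w - w*)^T Sigma (w - w*).
        return sum(eigs[j] * (w[j] - w_star[j]) ** 2 for j in range(d))

    w = [0.0] * d
    risks = [excess_risk(w)]
    for _ in range(steps):
        # Gradient of the empirical loss (1 / 2n) * ||Xw - y||^2.
        residuals = [
            sum(x[j] * w[j] for j in range(d)) - yi for x, yi in zip(X, y)
        ]
        grad = [
            sum(residuals[i] * X[i][j] for i in range(n)) / n for j in range(d)
        ]
        w = [w[j] - lr * grad[j] for j in range(d)]
        risks.append(excess_risk(w))
    return risks


if __name__ == "__main__":
    risks = simulate_gd_trajectory()
    best_step = min(range(len(risks)), key=risks.__getitem__)
    print(f"best iterate: step {best_step}, excess risk {risks[best_step]:.4f}")
    print(f"last iterate: step {len(risks) - 1}, excess risk {risks[-1]:.4f}")
```

Comparing the minimum over the recorded trajectory with the final entry shows how much an early-stopped iterate can differ from the last one on a given problem instance. The size of that gap depends on the assumed spectrum, noise level, and step size, so this sketch does not reproduce any result from the paper.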
