Paper Title

Benign Overfitting and Noisy Features

Authors

Zhu Li, Weijie Su, Dino Sejdinovic

Abstract

Modern machine learning often operates in the regime where the number of parameters is much higher than the number of data points, with zero training loss and yet good generalization, thereby contradicting the classical bias-variance trade-off. This \textit{benign overfitting} phenomenon has recently been characterized using so-called \textit{double descent} curves, where the risk undergoes a second descent (in addition to the classical U-shaped learning curve observed when the number of parameters is small) as the number of parameters is increased beyond a certain threshold. In this paper, we examine the conditions under which \textit{benign overfitting} occurs in random feature (RF) models, i.e., in a two-layer neural network with fixed first-layer weights. We adopt a new view of random features and show that \textit{benign overfitting} arises due to noise residing in such features (the noise may already be present in the data and propagate to the features, or it may be added by the user to the features directly), which plays an important implicit regularization role in the phenomenon.
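To make the implicit-regularization intuition concrete, below is a minimal numpy sketch (not the authors' code; the ReLU feature map, sigma = 0.5, and the lambda = n*sigma^2 pairing are illustrative assumptions). It relies on the classical identity that, for i.i.d. zero-mean feature noise E with variance sigma^2, the expected squared loss satisfies E||y - (Phi + E)beta||^2 = ||y - Phi*beta||^2 + n*sigma^2*||beta||^2, so fitting features corrupted by noise behaves, in expectation, like ridge regression on the clean features. The sketch fits a minimum-norm interpolator on one noisy draw of the features as a finite-sample proxy for that expectation and compares it with the matching ridge solution.

```python
# Minimal illustrative sketch: noise injected into random features acts like
# ridge regularization. All hyperparameters below are arbitrary choices.
import numpy as np

rng = np.random.default_rng(0)

# Toy data: n points in d dimensions with a simple noiseless target.
n, d, p = 100, 5, 2000            # p >> n: over-parameterized regime
X = rng.standard_normal((n, d))
y = np.sin(X[:, 0]) + 0.5 * X[:, 1]

# Random feature map: two-layer network with fixed random first-layer weights.
W = rng.standard_normal((d, p)) / np.sqrt(d)
Phi = np.maximum(X @ W, 0.0)      # ReLU features, shape (n, p)

# Inject i.i.d. Gaussian noise into the features.
sigma = 0.5
Phi_noisy = Phi + sigma * rng.standard_normal(Phi.shape)

# Minimum-norm interpolator of the noisy features (zero training loss on them).
beta_noisy = np.linalg.pinv(Phi_noisy) @ y

# Ridge regression on the clean features with the matching penalty n * sigma^2.
lam = n * sigma**2
beta_ridge = np.linalg.solve(Phi.T @ Phi + lam * np.eye(p), Phi.T @ y)

# Compare test error of both coefficient vectors using clean test features.
X_te = rng.standard_normal((1000, d))
y_te = np.sin(X_te[:, 0]) + 0.5 * X_te[:, 1]
Phi_te = np.maximum(X_te @ W, 0.0)

mse_noisy = np.mean((Phi_te @ beta_noisy - y_te) ** 2)
mse_ridge = np.mean((Phi_te @ beta_ridge - y_te) ** 2)
print(f"min-norm fit on noisy features:  test MSE = {mse_noisy:.4f}")
print(f"ridge (lambda = n*sigma^2) fit:  test MSE = {mse_ridge:.4f}")
```

Under these assumptions, the two test errors tend to be of a similar magnitude, which is the sense in which feature noise plays an implicit regularization role; the paper's formal conditions for benign overfitting are, of course, stated in the paper itself rather than in this sketch.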
