Paper Title

Spectral Regularization Allows Data-frugal Learning over Combinatorial Spaces

Paper Authors

Amirali Aghazadeh, Nived Rajaraman, Tony Tu, Kannan Ramchandran

Abstract

Data-driven machine learning models are being increasingly employed in several important inference problems in biology, chemistry, and physics which require learning over combinatorial spaces. Recent empirical evidence (see, e.g., [1], [2], [3]) suggests that regularizing the spectral representation of such models improves their generalization power when labeled data is scarce. However, despite these empirical studies, the theoretical underpinning of when and how spectral regularization enables improved generalization is poorly understood. In this paper, we focus on learning pseudo-Boolean functions and demonstrate that regularizing the empirical mean squared error by the L_1 norm of the spectral transform of the learned function reshapes the loss landscape and allows for data-frugal learning, under a restricted secant condition on the learner's empirical error measured against the ground truth function. Under a weaker quadratic growth condition, we show that stationary points which also approximately interpolate the training data points achieve statistically optimal generalization performance. Complementing our theory, we empirically demonstrate that running gradient descent on the regularized loss results in a better generalization performance compared to baseline algorithms in several data-scarce real-world problems.
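To make the regularizer concrete, here is a minimal illustrative sketch (not the paper's code): for small n, a pseudo-Boolean function on {-1, 1}^n can be expanded in the Walsh-Hadamard (parity) basis, and its spectral coefficients fit by minimizing the empirical mean squared error plus an L_1 penalty on those coefficients. The optimizer below is proximal gradient descent (ISTA) rather than plain gradient descent, and all function names and hyperparameters are hypothetical choices for this toy example.

```python
import numpy as np

def fourier_features(X):
    """Parity features chi_S(x) = prod_{i in S} x_i for every subset S."""
    m, n = X.shape
    feats = np.ones((m, 2 ** n))
    for S in range(2 ** n):          # S encoded as a bitmask over {0, ..., n-1}
        for i in range(n):
            if (S >> i) & 1:
                feats[:, S] *= X[:, i]
    return feats

def fit_spectral_l1(X, y, lam=0.1, lr=0.01, steps=2000):
    """ISTA on (1/m) * ||Phi a - y||^2 + lam * ||a||_1 over the spectrum a."""
    Phi = fourier_features(X)
    m, d = Phi.shape
    a = np.zeros(d)
    for _ in range(steps):
        grad = (2.0 / m) * Phi.T @ (Phi @ a - y)
        a = a - lr * grad
        a = np.sign(a) * np.maximum(np.abs(a) - lr * lam, 0.0)  # soft-threshold
    return a

# Toy data-scarce setting: sparse ground truth f(x) = 1.5 * x_0 * x_1 on
# {-1, 1}^4, with labels observed on only 12 of the 16 hypercube points.
n = 4
all_pts = np.array([[1.0 - 2.0 * ((k >> i) & 1) for i in range(n)]
                    for k in range(2 ** n)])
rng = np.random.default_rng(0)
X = all_pts[rng.permutation(2 ** n)[:12]]
y = 1.5 * X[:, 0] * X[:, 1]

a_hat = fit_spectral_l1(X, y)
# The L1 penalty drives the estimate toward the sparse truth: the coefficient
# for S = {0, 1} (bitmask 0b0011, i.e. index 3) should dominate.
print(int(np.argmax(np.abs(a_hat))))
```

The exhaustive 2^n feature expansion is only feasible for small n; it serves here to show how the L_1 penalty on the spectrum favors sparse Fourier representations even when some hypercube points are unlabeled.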
