Title
Stochastic Langevin Differential Inclusions with Applications to Machine Learning
Authors
Abstract
Stochastic differential equations of Langevin-diffusion form have received significant attention, thanks to their foundational role in both Bayesian sampling algorithms and optimization in machine learning. In the latter, they serve as a conceptual model of the stochastic gradient flow in the training of over-parameterized models. However, the literature typically assumes smoothness of the potential, whose gradient is the drift term. There are nevertheless many problems for which the potential function is not continuously differentiable, and hence the drift is not Lipschitz continuous everywhere; robust losses and Rectified Linear Units in regression problems exemplify this. In this paper, we establish foundational results on the flow and asymptotic properties of Langevin-type stochastic differential inclusions under assumptions appropriate to the machine learning setting. In particular, we show strong existence of the solution, as well as asymptotic minimization of the canonical free-energy functional.
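To make the nonsmooth setting concrete, the following is a minimal sketch (not code from the paper) of an Euler–Maruyama discretization of a Langevin inclusion with the nondifferentiable potential U(x) = |x|, a toy stand-in for robust losses. The subgradient selection, the inverse temperature `beta`, the step size `dt`, and all other parameter choices here are illustrative assumptions, not quantities specified in the abstract.

```python
import numpy as np

def subgrad_abs(x):
    # A measurable selection of the subdifferential of U(x) = |x|:
    # sign(x) for x != 0; at x = 0 any value in [-1, 1] is admissible,
    # and np.sign picks 0 there.
    return np.sign(x)

def langevin_step(x, beta, dt, rng):
    # One Euler-Maruyama step of the inclusion
    #   dX_t in -dU(X_t) dt + sqrt(2/beta) dW_t,
    # replacing the gradient drift by a subgradient selection.
    drift = -subgrad_abs(x)
    noise = np.sqrt(2.0 * dt / beta) * rng.standard_normal(x.shape)
    return x + drift * dt + noise

rng = np.random.default_rng(0)
x = np.full(10_000, 5.0)  # an ensemble of chains started at x = 5
for _ in range(2_000):
    x = langevin_step(x, beta=4.0, dt=0.01, rng=rng)

# For this potential the Gibbs density is proportional to exp(-beta * |x|),
# a Laplace distribution with scale 1/beta, so E|X| should be near 0.25.
print(np.mean(np.abs(x)))
```

The empirical mean of |X| approaching 1/beta illustrates, in discretized form, the kind of asymptotic behavior (convergence toward the Gibbs measure minimizing the free-energy functional) that the paper analyzes for the continuous-time inclusion.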