Paper title
A Concentration of Measure Framework to study convex problems and other implicit formulation problems in machine learning
Paper authors
Paper abstract
This paper provides a framework to show the concentration of solutions $Y^*$ of convex minimization problems whose objective function $ϕ(X)(Y)$ depends on some random vector $X$ satisfying concentration of measure hypotheses. More precisely, the convex problem translates into a contractive fixed-point equation that ensures the transmission of the concentration from $X$ to $Y^*$. This result is of central interest for characterizing the many machine learning algorithms that are defined through implicit equations (e.g., logistic regression, lasso, boosting, etc.). Based on our framework, we provide precise estimates of the first moments of the solution $Y^*$ when $X = (x_1,\ldots, x_n)$ is a data matrix with independent columns and $ϕ(X)(Y)$ takes the form of a sum $\frac{1}{n}\sum_{i=1}^n h_i(x_i^T Y)$. This allows us to describe the behavior and performance (e.g., generalization error) of a wide variety of machine learning classifiers.
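To make the abstract's setting concrete, here is a minimal, hypothetical sketch of the fixed-point viewpoint for one of the listed examples: $\ell_2$-regularized logistic regression. The first-order optimality condition of the objective $\frac{1}{n}\sum_i \log(1 + e^{-b_i x_i^T Y}) + \frac{\lambda}{2}\|Y\|^2$ can be rearranged as $Y = F(Y)$, and for a sufficiently large penalty $\lambda$ the map $F$ is a contraction, so the implicit solution $Y^*$ can be reached by plain fixed-point iteration. All dimensions, data, and the value of $\lambda$ below are illustrative assumptions, not the paper's exact setting.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 200, 5                        # n samples (columns), dimension p -- illustrative sizes
X = rng.standard_normal((p, n))      # data matrix with independent columns x_1, ..., x_n
b = rng.choice([-1.0, 1.0], size=n)  # binary labels

lam = 1.0                            # ridge penalty; assumed large enough to make F contractive

def sigmoid(t):
    return 1.0 / (1.0 + np.exp(-t))

def F(y):
    # Optimality condition of regularized logistic regression rewritten as y = F(y):
    #   lambda * y = (1/n) * sum_i b_i * sigmoid(-b_i x_i^T y) * x_i
    margins = b * (X.T @ y)          # b_i x_i^T y for each sample
    weights = sigmoid(-margins)      # per-sample residual weights in (0, 1)
    return X @ (b * weights) / (n * lam)

# Fixed-point iteration: contractivity guarantees convergence to the unique Y^*.
y = np.zeros(p)
for _ in range(1000):
    y_next = F(y)
    if np.linalg.norm(y_next - y) < 1e-12:
        y = y_next
        break
    y = y_next

residual = np.linalg.norm(y - F(y))  # ~0 once the implicit equation y = F(y) holds
```

The Jacobian of `F` has operator norm at most $\|X\|^2/(4 n \lambda)$ (since $\sigma'$ is bounded by $1/4$), which is below $1$ for the sizes chosen here; this is exactly the contraction property that, in the paper's framework, transports concentration from $X$ to $Y^*$.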