统计上最佳的一阶算法：通过正交化的证明

论文标题

统计上最佳的一阶算法：通过正交化的证明

Statistically Optimal First Order Algorithms: A Proof via Orthogonalization

论文作者

Montanari, Andrea, Wu, Yuchen

论文摘要

我们考虑了一类统计估计问题，其中我们将为我们提供一个随机数据矩阵$ {\ boldsymbol x} \ in {\ Mathbb r}^{n \ times d} $（n \ times d} $（可能有些标签$ {\ boldsymbol y} \ in {\ mathbb r}^n $），并且可以估计cofter a cofor a cofor a cofor和θ} \ in {\ Mathbb r}^d $（或可能是恒定数量的向量）。特殊情况包括在广义线性模型中（例如稀疏回归）中的低级矩阵估计和正则估计。一阶方法通过将当前估计值乘以$ {\ boldsymbol x} $或其转置。示例包括梯度下降或其加速变体。 Wu的Montanari Celentano证明，对于任何恒定数量的迭代（矩阵矢量乘法），最佳一阶算法是一种特定的近似消息传递算法（称为`bayes amp'）。该估计器的错误可以在高维渐近造成$ n，d \ to \ infty $，$ n/d \toδ$中表征，并为任何一阶算法的估计误差提供了下限。在这里，我们提供了相同结果的更简单的证明，并将其推广到更广泛的数据分布和一阶算法，包括具有不可分割的非线性的算法。最重要的是，新的证明技术不需要构建等效的树结构估计问题，因此易受更广泛的应用范围。

We consider a class of statistical estimation problems in which we are given a random data matrix ${\boldsymbol X}\in {\mathbb R}^{n\times d}$ (and possibly some labels ${\boldsymbol y}\in{\mathbb R}^n$) and would like to estimate a coefficient vector ${\boldsymbol θ}\in{\mathbb R}^d$ (or possibly a constant number of such vectors). Special cases include low-rank matrix estimation and regularized estimation in generalized linear models (e.g., sparse regression). First order methods proceed by iteratively multiplying current estimates by ${\boldsymbol X}$ or its transpose. Examples include gradient descent or its accelerated variants. Celentano, Montanari, Wu proved that for any constant number of iterations (matrix vector multiplications), the optimal first order algorithm is a specific approximate message passing algorithm (known as `Bayes AMP'). The error of this estimator can be characterized in the high-dimensional asymptotics $n,d\to\infty$, $n/d\toδ$, and provides a lower bound to the estimation error of any first order algorithm. Here we present a simpler proof of the same result, and generalize it to broader classes of data distributions and of first order algorithms, including algorithms with non-separable nonlinearities. Most importantly, the new proof technique does not require to construct an equivalent tree-structured estimation problem, and is therefore susceptible of a broader range of applications.

下载PDF全文

下载文献需遵守相关版权规定

论文标题