Paper Title
Adaptive and Oblivious Randomized Subspace Methods for High-Dimensional Optimization: Sharp Analysis and Lower Bounds
Paper Authors
Paper Abstract
We propose novel randomized optimization methods for high-dimensional convex problems based on restrictions of variables to random subspaces. We consider oblivious and data-adaptive subspaces and study their approximation properties via convex duality and Fenchel conjugates. A suitable adaptive subspace can be generated by sampling a correlated random matrix whose second-order statistics mirror the input data. We illustrate that the adaptive strategy can significantly outperform the standard oblivious sampling method, which is widely used in the recent literature. We show that the relative error of the randomized approximations can be tightly characterized in terms of the spectrum of the data matrix and the Gaussian width of the dual tangent cone at the optimum. We develop lower bounds for both optimization and statistical error measures based on concentration of measure and Fano's inequality. We then present the consequences of our theory for data matrices with varying spectral decay profiles. Experimental results show that the proposed approach enables significant speed-ups in a wide variety of machine learning and optimization problems, including logistic regression, kernel classification with random convolution layers, and shallow neural networks with rectified linear units.
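The core idea in the abstract — restricting the optimization variable to a random subspace, with the subspace drawn either obliviously (data-independent Gaussian) or adaptively (a correlated matrix whose second-order statistics mirror the data) — can be sketched on a least-squares instance. This is a minimal illustration, not the paper's actual algorithm: the problem, the spectral-decay construction, and the specific adaptive choice `S = Aᵀ G` (which gives `E[S Sᵀ] ∝ Aᵀ A`) are assumptions made here for demonstration.

```python
import numpy as np

def subspace_lstsq_error(A, b, S):
    """Restrict the variable to x = S z, solve the reduced least-squares
    problem min_z ||A S z - b||, and return the relative residual."""
    z, *_ = np.linalg.lstsq(A @ S, b, rcond=None)
    return np.linalg.norm(A @ (S @ z) - b) / np.linalg.norm(b)

def compare(seed, n=200, d=1000, m=50):
    rng = np.random.default_rng(seed)
    # Data matrix with polynomially decaying column scales, so the
    # spectrum decays -- the regime where adaptivity is expected to help.
    A = rng.standard_normal((n, d)) / np.arange(1, d + 1)
    x_true = np.zeros(d)
    x_true[:10] = 1.0                      # signal lives in the top directions
    b = A @ x_true

    S_obl = rng.standard_normal((d, m))          # oblivious Gaussian subspace
    S_ada = A.T @ rng.standard_normal((n, m))    # adaptive: E[S S^T] ~ A^T A
    return subspace_lstsq_error(A, b, S_obl), subspace_lstsq_error(A, b, S_ada)

# Average over a few random draws to smooth out sampling noise.
errs = np.array([compare(seed) for seed in range(5)])
err_obl, err_ada = errs.mean(axis=0)
print(f"oblivious relative error: {err_obl:.4f}")
print(f"adaptive  relative error: {err_ada:.4f}")
```

On instances like this one, where the data spectrum decays and the solution is aligned with the top singular directions, the adaptive subspace typically attains a markedly smaller relative residual than the oblivious Gaussian subspace at the same subspace dimension `m`.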