Paper Title

Asymptotic Analysis of Sampling Estimators for Randomized Numerical Linear Algebra Algorithms

Authors

Ma, Ping; Zhang, Xinlian; Xing, Xin; Ma, Jingyi; Mahoney, Michael W.

Abstract

The statistical analysis of Randomized Numerical Linear Algebra (RandNLA) algorithms within the past few years has mostly focused on their performance as point estimators. However, this is insufficient for conducting statistical inference, e.g., constructing confidence intervals and hypothesis testing, since the distribution of the estimator is lacking. In this article, we develop an asymptotic analysis to derive the distribution of RandNLA sampling estimators for the least-squares problem. In particular, we derive the asymptotic distribution of a general sampling estimator with arbitrary sampling probabilities. The analysis is conducted in two complementary settings, i.e., when the objective of interest is to approximate the full sample estimator or is to infer the underlying ground truth model parameters. For each setting, we show that the sampling estimator is asymptotically normally distributed under mild regularity conditions. Moreover, the sampling estimator is asymptotically unbiased in both settings. Based on our asymptotic analysis, we use two criteria, the Asymptotic Mean Squared Error (AMSE) and the Expected Asymptotic Mean Squared Error (EAMSE), to identify optimal sampling probabilities. Several of these optimal sampling probability distributions are new to the literature, e.g., the root leverage sampling estimator and the predictor length sampling estimator. Our theoretical results clarify the role of leverage in the sampling process, and our empirical results demonstrate improvements over existing methods.
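To make the idea of a sampling estimator with arbitrary sampling probabilities concrete, here is a minimal Python sketch. It is not the authors' implementation, and the abstract does not give the definitions, so the following are assumptions: rows are drawn with replacement, each drawn row is rescaled by 1/sqrt(r * pi_i), "root leverage" is read as probabilities proportional to the square root of the leverage scores, and "predictor length" is read as probabilities proportional to the Euclidean norm of each row of X. The function names `sampling_probabilities` and `sampling_estimator` are hypothetical.

```python
import numpy as np


def sampling_probabilities(X, scheme="leverage"):
    """Return a probability vector over the rows of X (hypothetical helper)."""
    n = X.shape[0]
    if scheme == "uniform":
        scores = np.ones(n)
    elif scheme == "predictor_length":
        # Assumption: probability proportional to the row norm ||x_i||.
        scores = np.linalg.norm(X, axis=1)
    elif scheme in ("leverage", "root_leverage"):
        # Leverage scores are the squared row norms of Q in a reduced QR of X.
        Q, _ = np.linalg.qr(X)
        leverage = np.sum(Q**2, axis=1)
        # Assumption: "root leverage" means proportional to sqrt(leverage).
        scores = leverage if scheme == "leverage" else np.sqrt(leverage)
    else:
        raise ValueError(f"unknown scheme: {scheme}")
    return scores / scores.sum()


def sampling_estimator(X, y, r, probs, rng):
    """Least-squares estimate from r rows sampled with replacement.

    Assumption: each sampled row is rescaled by 1 / sqrt(r * probs[i]),
    a common choice in the RandNLA literature.
    """
    n = X.shape[0]
    idx = rng.choice(n, size=r, replace=True, p=probs)
    w = 1.0 / np.sqrt(r * probs[idx])
    beta, *_ = np.linalg.lstsq(X[idx] * w[:, None], y[idx] * w, rcond=None)
    return beta


# Small synthetic demonstration (made-up data, for illustration only).
rng = np.random.default_rng(0)
X = rng.normal(size=(2000, 3))
beta_true = np.array([1.0, -2.0, 0.5])
y = X @ beta_true + 0.1 * rng.normal(size=2000)

beta_full, *_ = np.linalg.lstsq(X, y, rcond=None)
for scheme in ("uniform", "leverage", "root_leverage", "predictor_length"):
    probs = sampling_probabilities(X, scheme)
    beta_sub = sampling_estimator(X, y, r=500, probs=probs, rng=rng)
    print(scheme, np.linalg.norm(beta_sub - beta_full))
```

The two comparisons in the loop mirror the two settings in the abstract only loosely: the printed distance is to the full-sample estimator `beta_full`, while comparing against `beta_true` instead would correspond to inferring the underlying model parameters.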
