Paper Title

Finite- and Large-Sample Inference for Model and Coefficients in High-dimensional Linear Regression with Repro Samples

Authors

Wang, Peng; Xie, Min-Ge; Zhang, Linjun

Abstract

In this paper, we present a new and effective simulation-based approach to conduct both finite- and large-sample inference for high-dimensional linear regression models. This approach is developed under the so-called repro samples framework, in which we conduct statistical inference by creating and studying the behavior of artificial samples that are obtained by mimicking the sampling mechanism of the data. We obtain confidence sets for (a) the true model corresponding to the nonzero coefficients, (b) a single or any collection of regression coefficients, and (c) both the model and regression coefficients jointly. We also extend our approaches to drawing inferences on functions of the regression coefficients. The proposed approach fills two major gaps in the high-dimensional regression literature: (1) the lack of effective approaches to address model selection uncertainty and provide valid inference for the underlying true model; (2) the lack of effective inference approaches that guarantee finite-sample performance. We provide both finite-sample and asymptotic results to theoretically guarantee the performance of the proposed methods. In addition, our numerical results demonstrate that the proposed methods are valid and achieve better coverage with smaller confidence sets than the existing state-of-the-art approaches, such as debiasing and bootstrap approaches.
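To make the idea of "artificial samples that mimic the sampling mechanism" concrete, here is a minimal toy sketch in Python. It is not the paper's algorithm: it assumes a fixed, low-dimensional Gaussian linear model y = Xβ + σu with u ~ N(0, I), with no model selection and no high-dimensional setting, and the function name `repro_interval` and all parameters are hypothetical. It simulates artificial noise vectors, evaluates a pivot statistic on each, and keeps every candidate value of one coefficient for which the observed statistic looks typical among the simulated ones.

```python
import numpy as np

def repro_interval(X, y, j, level=0.95, n_draws=5000, seed=0):
    """Toy simulation-based confidence interval for coefficient j.

    Assumption (not from the paper): y = X @ beta + sigma * u, u ~ N(0, I),
    with n > p and the model known. Under the true beta_j, the studentized
    statistic (beta_hat_j - beta_j) / se_j depends only on u, so its
    distribution can be simulated by drawing artificial noise.
    """
    rng = np.random.default_rng(seed)
    n, p = X.shape
    XtX_inv = np.linalg.inv(X.T @ X)
    A = XtX_inv @ X.T                 # maps a vector to its OLS coefficients
    H = X @ A                         # projection onto the column space of X
    scale_j = np.sqrt(XtX_inv[j, j])

    def pivot(noise):
        # Studentized statistic written as a function of the noise alone.
        resid = noise - H @ noise
        s = np.sqrt(resid @ resid / (n - p))
        return (A[j] @ noise) / (scale_j * s)

    # Artificial samples: draw noise from the assumed mechanism and record
    # the absolute value of the pivot each time.
    draws = np.array([abs(pivot(rng.standard_normal(n))) for _ in range(n_draws)])
    q = np.quantile(draws, level)

    # Keep every candidate b with |beta_hat_j - b| / se_j <= q.
    beta_hat = A @ y
    resid = y - X @ beta_hat
    se = scale_j * np.sqrt(resid @ resid / (n - p))
    return beta_hat[j] - q * se, beta_hat[j] + q * se, q

# Small synthetic demonstration (all values are made up for illustration).
rng = np.random.default_rng(1)
n, p = 50, 3
X = rng.standard_normal((n, p))
beta_true = np.array([1.0, 0.0, -2.0])
y = X @ beta_true + 0.5 * rng.standard_normal(n)
lower, upper, q = repro_interval(X, y, j=0)
print(lower, upper)
```

In this simplified setting the simulated cutoff approximates the classical Student-t quantile, so the result is essentially a Monte Carlo version of the usual t-interval. The sketch only illustrates the simulate-and-compare logic; the model uncertainty and high-dimensional aspects that the abstract emphasizes are outside its scope.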
