Paper Title

On Model Identification and Out-of-Sample Prediction of Principal Component Regression: Applications to Synthetic Controls

Authors

Anish Agarwal, Devavrat Shah, Dennis Shen

Abstract

We analyze principal component regression (PCR) in a high-dimensional error-in-variables setting with fixed design. Under suitable conditions, we show that PCR consistently identifies the unique model with minimum $\ell_2$-norm. These results enable us to establish non-asymptotic out-of-sample prediction guarantees that improve upon the best known rates. In the course of our analysis, we introduce a natural linear algebraic condition between the in- and out-of-sample covariates, which allows us to avoid distributional assumptions for out-of-sample predictions. Our simulations illustrate the importance of this condition for generalization, even under covariate shifts. Accordingly, we construct a hypothesis test to check when this condition holds in practice. As a byproduct, our results also lead to novel results for the synthetic controls literature, a leading approach for policy evaluation. To the best of our knowledge, our prediction guarantees for the fixed design setting have been elusive in both the high-dimensional error-in-variables and synthetic controls literatures.
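To make the two central objects in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It assumes PCR means "truncate the noisy covariate matrix to its top-k singular components, then solve least squares via the pseudo-inverse", and it reads the linear algebraic condition as "the out-of-sample covariates lie in the row space of the in-sample covariates". The names `pcr_fit` and `rowspace_gap` are hypothetical, the rank `k` is assumed known, and `rowspace_gap` is only an illustrative diagnostic, not the hypothesis test constructed in the paper.

```python
import numpy as np


def pcr_fit(Z, y, k):
    """Rank-k principal component regression (illustrative sketch).

    Z : (n, p) observed, possibly noisy, covariates
    y : (n,) responses
    k : number of principal components to keep (assumed known here)

    Returns the coefficient vector beta and the top-k right singular
    vectors Vk. By construction beta lies in span(Vk), so it is the
    minimum l2-norm least-squares solution for the rank-k approximation of Z.
    """
    U, s, Vt = np.linalg.svd(Z, full_matrices=False)
    Uk, sk, Vk = U[:, :k], s[:k], Vt[:k].T
    # Pseudo-inverse of the rank-k truncation applied to y.
    beta = Vk @ ((Uk.T @ y) / sk)
    return beta, Vk


def rowspace_gap(X_train, X_test, k):
    """Relative energy of X_test outside the top-k row space of X_train.

    A value near 0 suggests the test covariates lie in the training row
    space; a value near 1 suggests they are mostly outside it.
    """
    _, _, Vt = np.linalg.svd(X_train, full_matrices=False)
    Vk = Vt[:k].T
    residual = X_test - X_test @ Vk @ Vk.T
    return np.linalg.norm(residual) / np.linalg.norm(X_test)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n, p, r = 200, 50, 3

    # Low-rank latent covariates X = A @ B, observed with additive noise.
    A, B = rng.normal(size=(n, r)), rng.normal(size=(r, p))
    X = A @ B
    Z = X + 0.1 * rng.normal(size=(n, p))

    beta_star = B.T @ rng.normal(size=r)  # true model, inside the row space
    y = X @ beta_star + 0.1 * rng.normal(size=n)

    beta_hat, _ = pcr_fit(Z, y, k=r)

    # Covariate shift that preserves the row space (rescaled latent factors).
    X_shift = (5.0 * rng.normal(size=(20, r))) @ B
    # Test covariates with no shared low-rank structure.
    X_out = rng.normal(size=(20, p))

    print("gap (shifted, same row space):", rowspace_gap(X, X_shift, r))
    print("gap (outside row space):      ", rowspace_gap(X, X_out, r))
    print("prediction error under shift: ",
          np.linalg.norm(X_shift @ (beta_hat - beta_star)) / np.sqrt(20))
```

In this sketch the shifted test covariates have a different distribution from the training covariates but the same row space, which is the kind of situation the abstract says the subspace condition is meant to cover.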
