Paper title
Linear Regression with Centrality Measures
Authors
Abstract
This paper studies the properties of linear regression on centrality measures when network data are sparse (that is, when there are many more agents than links per agent) and when the network is measured with error. We make three contributions in this setting: (1) We show that OLS estimators can become inconsistent under sparsity and characterize the threshold at which this occurs, with and without measurement error. This threshold depends on the centrality measure used. Specifically, regression on eigenvector centrality is less robust to sparsity than regression on degree or diffusion centrality. (2) We develop distributional theory for OLS estimators under measurement error and sparsity, finding that OLS estimators are subject to asymptotic bias even when they are consistent. Moreover, the bias can be large relative to the variance, so that bias correction is necessary for inference. (3) We propose novel bias correction and inference methods for OLS with sparse, noisy networks. Simulation evidence suggests that our theory and methods perform well, particularly in settings where the usual OLS estimators and heteroskedasticity-consistent/robust t-tests are deficient. Finally, we demonstrate the utility of our results in an application inspired by De Weerdt and Dercon (2006), in which we consider consumption smoothing and social insurance in Nyakatoke, Tanzania.
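To make the setting concrete, the following is a minimal simulation sketch, not the paper's estimator or its bias correction. It assumes an Erdős–Rényi network, an outcome that depends linearly on true degree centrality, and a hypothetical error model in which the observed network contains spurious links. All function names and parameter values (`simulate_graph`, `run_demo`, `n`, `p`, `q`, `beta`) are illustrative assumptions. It compares the OLS slope on true degree with the slope on degree computed from the noisy network.

```python
import numpy as np


def simulate_graph(n, p, rng):
    """Symmetric Erdős–Rényi adjacency matrix with no self-links."""
    upper = np.triu(rng.random((n, n)) < p, k=1)  # keep strict upper triangle
    return (upper | upper.T).astype(float)        # mirror to make it symmetric


def degree_centrality(adj):
    """Degree centrality: number of links per agent."""
    return adj.sum(axis=1)


def ols_slope(x, y):
    """Slope from an OLS regression of y on a constant and x."""
    design = np.column_stack([np.ones_like(x), x])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef[1]


def run_demo(n=500, p=0.02, q=0.02, beta=1.0, seed=0):
    """Compare OLS slopes using true versus noisily observed degree.

    Assumed error model (hypothetical, for illustration only): the observed
    network is the true network plus spurious links drawn with probability q.
    """
    rng = np.random.default_rng(seed)
    true_adj = simulate_graph(n, p, rng)
    spurious = simulate_graph(n, q, rng)
    observed_adj = np.maximum(true_adj, spurious)  # union of true and spurious links

    # Outcome depends on the true (unobserved) degree centrality.
    y = beta * degree_centrality(true_adj) + rng.normal(size=n)

    return {
        "true_degree": ols_slope(degree_centrality(true_adj), y),
        "noisy_degree": ols_slope(degree_centrality(observed_adj), y),
    }


if __name__ == "__main__":
    for name, slope in run_demo().items():
        print(f"{name}: {slope:.3f}")
```

Under these assumptions the slope on the noisy degree is pulled toward zero relative to the slope on the true degree, which is the classical attenuation effect of measurement error. The sketch covers degree centrality only; eigenvector and diffusion centrality, and the sparsity thresholds the paper characterizes, are outside its scope.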