Paper title

Practical considerations for sandwich variance estimation in two-stage regression settings

Authors

Boe, Lillian A.; Lumley, Thomas; Shaw, Pamela A.

Abstract

We present a practical approach for computing the sandwich variance estimator in two-stage regression model settings. As a motivating example for two-stage regression, we consider regression calibration, a popular approach for addressing covariate measurement error. The sandwich variance approach has rarely been applied in regression calibration, even though it requires less computation time than popular resampling approaches for variance estimation, specifically the bootstrap. This is likely because it requires specialized statistical coding. In practice, a simple bootstrap approach with Wald confidence intervals is often applied, but this approach can yield confidence intervals that do not achieve the nominal coverage level. We first outline the steps needed to compute the sandwich variance estimator. We then develop a convenient method of computation in R for sandwich variance estimation, which leverages standard regression model outputs and existing R functions and can be applied to either a simple random sample or a complex survey design. We use a simulation study to compare the performance of the sandwich estimator with that of a resampling variance approach in both data settings. Finally, we further compare these two variance estimation approaches in data examples from the Women's Health Initiative (WHI) and the Hispanic Community Health Study/Study of Latinos (HCHS/SOL).
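As a rough illustration of what the sandwich variance estimator looks like in a two-stage setting, the standard stacked estimating equation form can be sketched as follows. The notation is an assumption made for this sketch and is not taken from the paper: $\alpha$ denotes the first-stage (e.g., calibration model) parameters, $\beta$ the second-stage (outcome model) parameters, $O_i$ the data for observation $i$, and $\psi_1$, $\psi_2$ the corresponding estimating functions.

```latex
% Stacked estimating equations for theta = (alpha, beta); notation assumed for illustration
\[
\sum_{i=1}^{n} \psi(O_i;\hat\theta)
  = \sum_{i=1}^{n}
    \begin{pmatrix}
      \psi_1(O_i;\hat\alpha) \\
      \psi_2(O_i;\hat\alpha,\hat\beta)
    \end{pmatrix}
  = 0 .
\]

% Empirical "bread" (A) and "meat" (B) matrices
\[
\hat A = -\frac{1}{n}\sum_{i=1}^{n}
         \frac{\partial \psi(O_i;\theta)}{\partial \theta^{\top}}\Big|_{\theta=\hat\theta},
\qquad
\hat B = \frac{1}{n}\sum_{i=1}^{n}
         \psi(O_i;\hat\theta)\,\psi(O_i;\hat\theta)^{\top} .
\]

% Sandwich variance estimator for the stacked parameter vector
\[
\widehat{\operatorname{Var}}(\hat\theta)
  = \frac{1}{n}\,\hat A^{-1}\,\hat B\,\hat A^{-\top} .
\]

% Because psi_1 does not depend on beta, A is block lower triangular
\[
\hat A =
\begin{pmatrix}
  \hat A_{11} & 0 \\
  \hat A_{21} & \hat A_{22}
\end{pmatrix} .
\]
```

Under this formulation, the variance of $\hat\beta$ is the lower-right block of $\widehat{\operatorname{Var}}(\hat\theta)$. The off-diagonal block $\hat A_{21}$ is what carries the uncertainty from the first-stage estimate $\hat\alpha$ into the second stage, which a naive second-stage variance would ignore.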
