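Moving beyond the classic difference-in-differences model: A simulation study comparing statistical methods for estimating effectiveness of state-level policies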

Paper title

Moving beyond the classic difference-in-differences model: A simulation study comparing statistical methods for estimating effectiveness of state-level policies

Authors

Griffin, Beth Ann; Schuler, Megan S.; Stuart, Elizabeth A.; Patrick, Stephen; McNeer, Elizabeth; Smart, Rosanna; Powell, David; Stein, Bradley D.; Schell, Terry; Pacula, Rosalie L.

Abstract

State-level policy evaluations commonly employ a difference-in-differences (DID) study design; yet within this framework, statistical model specification varies notably across studies. Motivated by applied state-level opioid policy evaluations, this simulation study compares statistical performance of multiple variations of two-way fixed effect models traditionally used for DID under a range of simulation conditions. While most linear models resulted in minimal bias, non-linear models and population-weighted versions of classic linear two-way fixed effect and linear GEE models yielded considerable bias (60 to 160%). Further, root mean square error is minimized by linear AR models when examining crude mortality rates and by negative binomial models when examining raw death counts. In the context of frequentist hypothesis testing, many models yielded high Type I error rates and very low rates of correctly rejecting the null hypothesis (< 10%), raising concerns of spurious conclusions about policy effectiveness. When considering performance across models, the linear autoregressive models were optimal in terms of directional bias, root mean squared error, Type I error, and correct rejection rates. These findings highlight notable limitations of traditional statistical models commonly used for DID designs, designs widely used in opioid policy studies and in state policy evaluations more broadly.
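For orientation, the classic linear two-way fixed effects model that the abstract treats as the baseline can be sketched as an ordinary least squares fit with state and year indicators plus a policy indicator. This is an illustrative sketch only, not the authors' simulation code: the function name `twfe_estimate`, the panel dimensions, the adoption pattern, and the effect size are all assumptions, and the toy panel is noise-free so that the coefficient is recovered exactly.

```python
import numpy as np


def twfe_estimate(y, treated):
    """OLS estimate of the policy coefficient in a two-way fixed effects model.

    y, treated: arrays of shape (n_states, n_years).
    """
    n_states, n_years = y.shape
    # Row-major flattening: all years of state 0, then all years of state 1, ...
    state_idx = np.repeat(np.arange(n_states), n_years)
    year_idx = np.tile(np.arange(n_years), n_states)
    # Design matrix: policy indicator, state dummies, year dummies
    # (first year dropped to avoid collinearity with the state dummies).
    X = np.column_stack([
        treated.reshape(-1).astype(float),
        np.eye(n_states)[state_idx],
        np.eye(n_years)[year_idx][:, 1:],
    ])
    coef, *_ = np.linalg.lstsq(X, y.reshape(-1), rcond=None)
    return coef[0]


# Hypothetical toy panel: 10 states, 8 years, 5 states adopt the policy in year index 4.
rng = np.random.default_rng(0)
n_states, n_years, true_effect = 10, 8, -1.5
state_fx = rng.normal(size=(n_states, 1))
year_fx = rng.normal(size=(1, n_years))
treated = np.zeros((n_states, n_years))
treated[:5, 4:] = 1.0
# Outcome with additive state and year effects and no noise (assumption for clarity).
y = state_fx + year_fx + true_effect * treated

estimate = twfe_estimate(y, treated)
print(round(estimate, 6))
```

With simultaneous adoption and no noise, the fitted coefficient equals the assumed effect. The models compared in the study (population-weighted, GEE, negative binomial, autoregressive) change the weighting, link function, or error structure around this baseline and are not reproduced here.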
