Paper Title

Sufficient Dimension Reduction for Average Causal Effect Estimation

Authors

Debo Cheng, Jiuyong Li, Lin Liu, Jixue Liu

Abstract

A large number of covariates can degrade the quality of causal effect estimation, because confounding adjustment becomes unreliable when the number of covariates is large relative to the number of available samples. The propensity score is a common way to deal with a large covariate set, but the accuracy of propensity score estimation (normally done by logistic regression) is itself challenged by a large number of covariates. In this paper, we prove that a large covariate set can be reduced to a lower-dimensional representation that captures the complete information needed for adjustment in causal effect estimation. This theoretical result enables effective data-driven algorithms for causal effect estimation. We develop an algorithm that employs a supervised kernel dimension reduction method to search for a lower-dimensional representation of the original covariates, and then uses nearest neighbor matching in the reduced covariate space to impute the counterfactual outcomes, which avoids the problems caused by a large covariate set. The proposed algorithm is evaluated on two semi-synthetic and three real-world datasets, and the results demonstrate its effectiveness.
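
The second stage described in the abstract, nearest neighbor matching in a reduced covariate space, can be illustrated with a minimal sketch. This is not the authors' implementation: the supervised kernel dimension reduction step is not reproduced here, and the `reduce` argument is a hypothetical placeholder for whatever low-dimensional map that step would learn. The function name `match_ate`, the choice of one nearest neighbor with replacement, and the synthetic data are all assumptions made for illustration.

```python
import math
import random


def match_ate(covariates, treatment, outcome, reduce=lambda x: x):
    """Estimate the average causal effect by 1-nearest-neighbor matching.

    covariates: list of feature vectors (lists of floats)
    treatment:  list of 0/1 treatment indicators
    outcome:    list of observed outcomes
    reduce:     hypothetical map from a covariate vector to a
                lower-dimensional vector (stands in for a learned reduction)
    """
    z = [reduce(x) for x in covariates]
    treated = [i for i, w in enumerate(treatment) if w == 1]
    control = [i for i, w in enumerate(treatment) if w == 0]
    if not treated or not control:
        raise ValueError("need at least one treated and one control unit")

    effects = []
    for i in range(len(z)):
        # Search the opposite treatment group for the closest unit.
        pool = control if treatment[i] == 1 else treated
        j = min(pool, key=lambda k: math.dist(z[i], z[k]))
        # The matched unit's outcome imputes the counterfactual of unit i.
        if treatment[i] == 1:
            y1, y0 = outcome[i], outcome[j]
        else:
            y1, y0 = outcome[j], outcome[i]
        effects.append(y1 - y0)
    return sum(effects) / len(effects)


# Synthetic illustration (assumed setup): 10 covariates, of which only the
# first one confounds treatment and outcome; the true effect is 2.0.
random.seed(0)
n = 400
X = [[random.gauss(0, 1) for _ in range(10)] for _ in range(n)]
W = [1 if random.random() < 1 / (1 + math.exp(-x[0])) else 0 for x in X]
Y = [2.0 * w + 3.0 * x[0] + random.gauss(0, 0.1) for x, w in zip(X, W)]

# Matching on the one relevant coordinate versus on all ten covariates.
ate_reduced = match_ate(X, W, Y, reduce=lambda x: x[:1])
ate_full = match_ate(X, W, Y)
print("reduced space:", ate_reduced)
print("full space:   ", ate_full)
```

In this toy setup the reduction is known in advance (keep the first coordinate); in the setting of the paper it would have to be learned from data. Matching is done with replacement, so one unit may serve as the match for several others.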
