Paper Title
Robust Identifiability in Linear Structural Equation Models of Causal Inference
Paper Authors
Paper Abstract
In this work, we consider the problem of robust parameter estimation from observational data in the context of linear structural equation models (LSEMs). LSEMs are a popular and well-studied class of models for inferring causality in the natural and social sciences. One of the main problems related to LSEMs is recovering the model parameters from observational data. Under various conditions on LSEMs and the model parameters, prior work provides efficient algorithms to recover the parameters. However, these results are often about generic identifiability. In practice, generic identifiability is not sufficient and we need robust identifiability: small changes in the observational data should not change the recovered parameters by a large amount. Robust identifiability has received far less attention and remains poorly understood. Sankararaman et al. (2019) recently provided a set of sufficient conditions on parameters under which robust identifiability is feasible. However, a limitation of their work is that their results apply only to a small sub-class of LSEMs, called "bow-free paths." In this work, we significantly extend their work along multiple dimensions. First, for a large and well-studied class of LSEMs, namely "bow-free" models, we provide a sufficient condition on model parameters under which robust identifiability holds, thereby removing the restriction to paths required by prior work. We then show that this sufficient condition holds with high probability, which implies that robust identifiability holds for a large set of parameters and that, for such parameters, existing algorithms already achieve robust identifiability. Finally, we validate our results on both simulated and real-world datasets.
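The distinction between generic and robust identifiability can be illustrated with a small sketch. It is not the paper's algorithm or its sufficient condition. It assumes a hypothetical three-node bow-free path 1 -> 2 -> 3 with a single bidirected (correlated-noise) edge between nodes 1 and 3, and all function names and parameter values are chosen purely for illustration. The edge weights are recovered exactly from the true covariance matrix, yet the same small perturbation of one covariance entry moves the recovered weight far more when the noise variance of node 2 is small.

```python
def covariance(lam12, lam23, omega):
    """Covariance of (X1, X2, X3) for the path 1 -> 2 -> 3 with noise covariance omega."""
    # Rows of B express X1, X2, X3 as linear combinations of the noise terms:
    # X1 = e1, X2 = lam12*e1 + e2, X3 = lam12*lam23*e1 + lam23*e2 + e3.
    B = [[1.0, 0.0, 0.0],
         [lam12, 1.0, 0.0],
         [lam12 * lam23, lam23, 1.0]]
    # Sigma = B * Omega * B^T, written out entry by entry.
    return [[sum(B[i][k] * omega[k][l] * B[j][l]
                 for k in range(3) for l in range(3))
             for j in range(3)] for i in range(3)]


def recover(sigma):
    """Recover (lam12, lam23) from a covariance matrix of this particular graph."""
    lam12 = sigma[0][1] / sigma[0][0]
    # Covariance of the residual X2 - lam12*X1 with X3, divided by its covariance
    # with X2, both scaled by sigma[0][0] to avoid an intermediate division.
    num = sigma[0][0] * sigma[1][2] - sigma[0][1] * sigma[0][2]
    den = sigma[0][0] * sigma[1][1] - sigma[0][1] ** 2
    return lam12, num / den


def sensitivity(omega22, delta=1e-4):
    """Absolute change in the recovered lam23 when Sigma[1][2] is shifted by delta."""
    # Illustrative noise covariance: nodes 1 and 3 share correlated noise (0.3),
    # node 2 has independent noise with variance omega22.
    omega = [[1.0, 0.0, 0.3],
             [0.0, omega22, 0.0],
             [0.3, 0.0, 1.0]]
    sigma = covariance(0.8, -0.5, omega)
    perturbed = [row[:] for row in sigma]
    perturbed[1][2] += delta  # keep the perturbed matrix symmetric
    perturbed[2][1] += delta
    return abs(recover(perturbed)[1] - recover(sigma)[1])


if __name__ == "__main__":
    print("well-conditioned change:", sensitivity(1.0))
    print("ill-conditioned change: ", sensitivity(1e-3))
```

In this toy graph the change in the recovered weight works out to `delta / omega22`, so both parameter settings are identifiable in the generic sense, but only the first is recovered stably under small data perturbations.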