Paper Title
Generalizable Information Theoretic Causal Representation
Paper Authors
Paper Abstract
There is evidence that representation learning can improve a model's performance on multiple downstream tasks in many real-world scenarios, such as image classification and recommender systems. Existing learning approaches rely on establishing the correlation (or a proxy of it) between features and the downstream task (labels), which typically results in a representation containing causes, effects, and spuriously correlated variables of the label. Its generalizability may deteriorate because of the instability of the non-causal parts. In this paper, we propose to learn causal representations from observational data by regularizing the learning procedure with mutual information measures according to our hypothetical causal graph. The optimization involves a counterfactual loss, based on which we deduce a theoretical guarantee that the causality-inspired learning has reduced sample complexity and better generalization ability. Extensive experiments show that models trained on the causal representations learned by our approach are robust under adversarial attacks and distribution shifts.
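The abstract only sketches the optimization at a high level: a mutual-information regularizer derived from a hypothetical causal graph plus a counterfactual loss. Below is a minimal PyTorch sketch of an objective of that shape. The encoder split into causal and non-causal halves, the cross-covariance proxy standing in for the mutual-information measures, the swap-based counterfactual term, and all names and hyperparameters are illustrative assumptions, not the paper's actual formulation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class SplitEncoder(nn.Module):
    """Encoder whose code is split into a 'causal' half and a 'non-causal' half."""

    def __init__(self, in_dim: int, rep_dim: int, num_classes: int):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(in_dim, 128), nn.ReLU(), nn.Linear(128, rep_dim)
        )
        self.classifier = nn.Linear(rep_dim, num_classes)

    def encode(self, x):
        z = self.encoder(x)
        # Illustrative assumption: the first half of the code plays the causal role.
        return z.chunk(2, dim=-1)

    def predict(self, z_causal, z_noncausal):
        return self.classifier(torch.cat([z_causal, z_noncausal], dim=-1))


def cross_cov_penalty(a: torch.Tensor, b: torch.Tensor) -> torch.Tensor:
    """Squared cross-covariance, used here as a crude stand-in for a
    mutual-information penalty: driving it to zero removes linear dependence."""
    a = a - a.mean(dim=0, keepdim=True)
    b = b - b.mean(dim=0, keepdim=True)
    cov = a.t() @ b / (a.size(0) - 1)
    return cov.pow(2).mean()


def causal_rep_loss(model, x, y, lam_mi=1.0, lam_cf=1.0):
    z_c, z_nc = model.encode(x)
    logits = model.predict(z_c, z_nc)
    task_loss = F.cross_entropy(logits, y)

    # MI-style regularizer: the label should not be (linearly) recoverable from
    # the non-causal part of the representation.
    y_onehot = F.one_hot(y, num_classes=logits.size(-1)).float()
    mi_loss = cross_cov_penalty(z_nc, y_onehot)

    # Counterfactual consistency: intervene on the non-causal factors by swapping
    # them across the batch; the prediction should not change.
    perm = torch.randperm(x.size(0), device=x.device)
    logits_cf = model.predict(z_c, z_nc[perm])
    cf_loss = F.kl_div(
        F.log_softmax(logits_cf, dim=-1),
        F.softmax(logits.detach(), dim=-1),
        reduction="batchmean",
    )

    return task_loss + lam_mi * mi_loss + lam_cf * cf_loss


# Example: one optimization step on random data.
if __name__ == "__main__":
    model = SplitEncoder(in_dim=32, rep_dim=16, num_classes=4)
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)
    x, y = torch.randn(64, 32), torch.randint(0, 4, (64,))
    loss = causal_rep_loss(model, x, y)
    loss.backward()
    opt.step()
```

The two penalties mirror the roles described in the abstract: the first discourages label information from leaking into the non-causal part of the representation, and the second enforces invariance of predictions under interventions on that part, which is one simple way a counterfactual loss can be instantiated.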