PAC Generalization via Invariant Representations

Paper title

PAC Generalization via Invariant Representations

Authors

Advait Parulekar, Karthikeyan Shanmugam, Sanjay Shakkottai

Abstract

One method for obtaining generalizable solutions to machine learning tasks when presented with diverse training environments is to find invariant representations of the data. These are representations of the covariates such that the best model on top of the representation is invariant across training environments. In the context of linear Structural Equation Models (SEMs), invariant representations might allow us to learn models with out-of-distribution guarantees, i.e., models that are robust to interventions in the SEM. To address the invariant representation problem in a finite-sample setting, we consider the notion of $ε$-approximate invariance. We study the following question: if a representation is approximately invariant with respect to a given number of training interventions, will it continue to be approximately invariant on a larger collection of unseen SEMs? This larger collection of SEMs is generated through a parameterized family of interventions. Inspired by PAC learning, we obtain finite-sample out-of-distribution generalization guarantees for approximate invariance that hold probabilistically over a family of linear SEMs without faithfulness assumptions. Our results show bounds that do not scale with the ambient dimension when intervention sites are restricted to lie in a constant-size subset of nodes with bounded in-degree. We also show how to extend our results to a linear indirect observation model that incorporates latent variables.
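
As a rough illustration of what invariance of a representation across environments can look like, the sketch below simulates a toy linear SEM and compares two candidate representations. Everything in it is an assumption made for illustration: the three-node SEM X1 -> Y -> X2, the helper names `sample_env` and `invariance_gap`, and the use of the largest distance between per-environment least-squares coefficients as a stand-in for the $ε$ in approximate invariance. The abstract does not give the paper's formal definition, so this should not be read as the authors' method.

```python
import numpy as np

def sample_env(n, shift, rng):
    """Draw n samples from a toy linear SEM X1 -> Y -> X2 (hypothetical).
    `shift` rescales the noise of X2, mimicking an intervention on X2."""
    x1 = rng.normal(size=n)
    y = 2.0 * x1 + rng.normal(size=n)
    x2 = y + shift * rng.normal(size=n)
    return np.column_stack([x1, x2]), y

def invariance_gap(envs, phi):
    """Largest distance between the per-environment least-squares
    coefficients fitted on top of the linear representation X @ phi."""
    coefs = []
    for X, y in envs:
        Z = X @ phi
        w, *_ = np.linalg.lstsq(Z, y, rcond=None)
        coefs.append(w)
    return max(np.linalg.norm(a - b) for a in coefs for b in coefs)

rng = np.random.default_rng(0)
# Three training environments that differ only in the intervention on X2.
envs = [sample_env(20000, s, rng) for s in (0.5, 1.0, 2.0)]

phi_parent = np.array([[1.0], [0.0]])  # representation that keeps X1 (parent of Y)
phi_child = np.array([[0.0], [1.0]])   # representation that keeps X2 (child of Y)

gap_parent = invariance_gap(envs, phi_parent)
gap_child = invariance_gap(envs, phi_child)
print(f"gap using X1: {gap_parent:.3f}, gap using X2: {gap_child:.3f}")
```

In this toy setup the representation built on the parent X1 yields nearly identical predictors in every environment, so its gap is small and shrinks with more samples, while the representation built on the child X2 yields predictors that change with the intervention. The question studied in the abstract is whether a small gap measured on a few training interventions carries over to unseen interventions from the same parameterized family.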
