Paper title
Towards Backwards-Compatible Data with Confounded Domain Adaptation
Paper authors
Paper abstract
Most current domain adaptation methods address either covariate shift or label shift, but are not applicable where they occur simultaneously and are confounded with each other. Domain adaptation approaches which do account for such confounding are designed to adapt covariates to optimally predict a particular label whose shift is confounded with covariate shift. In this paper, we instead seek to achieve general-purpose data backwards compatibility. This would allow the adapted covariates to be used for a variety of downstream problems, including on pre-existing prediction models and on data analytics tasks. To do this we consider a modification of generalized label shift (GLS), which we call confounded shift. We present a novel framework for this problem, based on minimizing the expected divergence between the source and target conditional distributions, conditioning on possible confounders. Within this framework, we provide concrete implementations using the Gaussian reverse Kullback-Leibler divergence and the maximum mean discrepancy. Finally, we demonstrate our approach on synthetic and real datasets.
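As a rough illustration of the two divergences named in the abstract, the following is a minimal sketch, not the paper's implementation. It shows the closed-form KL divergence between two univariate Gaussians and a biased estimate of the squared maximum mean discrepancy (MMD) with an RBF kernel. The function names, the univariate restriction, the `bandwidth` parameter, and the choice of RBF kernel are all assumptions made for this example. The abstract does not state which argument order it calls "reverse", so `gaussian_kl` simply computes KL(p || q) for whichever pair is passed in. The conditioning on confounders described in the abstract is not modeled here.

```python
import math


def gaussian_kl(mean_p, var_p, mean_q, var_q):
    """Closed-form KL(p || q) for univariate Gaussians p and q (hypothetical helper)."""
    return 0.5 * (
        math.log(var_q / var_p)
        + (var_p + (mean_p - mean_q) ** 2) / var_q
        - 1.0
    )


def rbf_kernel(x, y, bandwidth=1.0):
    """RBF kernel between two equal-length vectors; the bandwidth is an assumed parameter."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))


def mmd_squared(xs, ys, bandwidth=1.0):
    """Biased (V-statistic) estimate of squared MMD between samples xs and ys."""
    def mean_kernel(a_set, b_set):
        total = sum(rbf_kernel(a, b, bandwidth) for a in a_set for b in b_set)
        return total / (len(a_set) * len(b_set))

    return mean_kernel(xs, xs) + mean_kernel(ys, ys) - 2.0 * mean_kernel(xs, ys)


# Toy usage: compare made-up "source" and "target" samples.
source = [(0.0,), (0.5,), (1.0,)]
target = [(2.0,), (2.5,), (3.0,)]
kl_value = gaussian_kl(0.0, 1.0, 1.0, 1.0)
mmd_value = mmd_squared(source, target)
```

Both quantities are zero when the two distributions (or samples) coincide and grow as they move apart, which is the property a divergence-minimizing adaptation would rely on.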