Paper Title
Subject Granular Differential Privacy in Federated Learning
Paper Authors
Paper Abstract
This paper considers subject-level privacy in the federated learning (FL) setting, where a subject is an individual whose private information is embodied by several data items that are either confined within a single federation user or distributed across multiple federation users. We propose two new algorithms that enforce subject-level DP locally at each federation user. Our first algorithm, LocalGroupDP, is a straightforward application of group differential privacy within the popular DP-SGD algorithm. Our second algorithm, HiGradAvgDP, is based on the novel idea of hierarchical gradient averaging over the subjects participating in a training mini-batch. We also show that user-level Local Differential Privacy (LDP) naturally guarantees subject-level DP. We further observe the problem of horizontal composition of subject-level privacy loss in FL: the subject-level privacy loss incurred at individual users composes across the federation. We formally prove the subject-level DP guarantees of our algorithms and show their effect on model utility loss. Our empirical evaluation on the FEMNIST and Shakespeare datasets shows that LocalGroupDP delivers the best performance among our algorithms; however, its model utility lags behind that of models trained with a DP-SGD-based algorithm that provides only a weaker, item-level privacy guarantee. Privacy loss amplification due to subject sampling fractions and horizontal composition remains a key challenge for model utility.
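Since the abstract describes HiGradAvgDP only at a high level, the sketch below illustrates one plausible reading of the hierarchical gradient averaging step in Python. It assumes per-item gradients have already been computed and that every mini-batch item is tagged with its subject's ID; the function name higrad_avg_dp_step, the normalization by subject count, and the noise calibration are illustrative assumptions, not the authors' reference implementation.

```python
# Minimal sketch of hierarchical gradient averaging with subject-level
# clipping and Gaussian noise, assuming per-item gradients are given.
import numpy as np

def higrad_avg_dp_step(per_item_grads, subject_ids, clip_norm,
                       noise_multiplier, rng):
    """One DP-SGD-style step where the unit of privacy is a subject.

    per_item_grads : (batch_size, dim) array of per-item gradients.
    subject_ids    : length batch_size; subject_ids[i] identifies the
                     subject whose data produced item i.
    Returns the noised average gradient for this mini-batch.
    """
    per_item_grads = np.asarray(per_item_grads, dtype=float)
    subject_ids = np.asarray(subject_ids)
    subjects = np.unique(subject_ids)

    # 1. Hierarchical averaging: collapse all items of a subject into one
    #    per-subject gradient, so each subject contributes exactly one
    #    vector regardless of how many of its items were sampled.
    per_subject = np.stack([
        per_item_grads[subject_ids == s].mean(axis=0) for s in subjects
    ])

    # 2. Clip each per-subject gradient to bound any single subject's
    #    L2 influence (sensitivity) on the model update.
    norms = np.linalg.norm(per_subject, axis=1, keepdims=True)
    per_subject = per_subject * np.minimum(1.0, clip_norm /
                                           np.maximum(norms, 1e-12))

    # 3. Sum, add Gaussian noise calibrated to the clip norm, and
    #    normalize (here, by the number of subjects in the batch --
    #    an illustrative choice).
    noise = rng.normal(0.0, noise_multiplier * clip_norm,
                       size=per_subject.shape[1])
    return (per_subject.sum(axis=0) + noise) / len(subjects)

# Example: a mini-batch of 5 items drawn from 3 subjects.
rng = np.random.default_rng(0)
grads = rng.normal(size=(5, 4))
update = higrad_avg_dp_step(grads, subject_ids=[0, 0, 1, 2, 2],
                            clip_norm=1.0, noise_multiplier=1.1, rng=rng)
print(update)
```

Averaging a subject's items before clipping is what bounds any one subject's L2 contribution by the clip norm, independent of how many of its items appear in the batch. LocalGroupDP, as described in the abstract, instead keeps DP-SGD's per-item clipping and obtains the subject-level guarantee through group differential privacy, whose accounted privacy loss grows with the maximum number of items a subject may hold.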