Paper Title

Bayesian Estimation of Differential Privacy

Paper Authors

Santiago Zanella-Béguelin, Lukas Wutschitz, Shruti Tople, Ahmed Salem, Victor Rühle, Andrew Paverd, Mohammad Naseri, Boris Köpf, Daniel Jones

Paper Abstract

Algorithms such as Differentially Private SGD enable training machine learning models with formal privacy guarantees. However, there is a discrepancy between the protection that such algorithms guarantee in theory and the protection they afford in practice. An emerging strand of work empirically estimates the protection afforded by differentially private training as a confidence interval for the privacy budget $\varepsilon$ spent on training a model. Existing approaches derive confidence intervals for $\varepsilon$ from confidence intervals for the false positive and false negative rates of membership inference attacks. Unfortunately, obtaining narrow high-confidence intervals for $\varepsilon$ using this method requires an impractically large sample size and training as many models as samples. We propose a novel Bayesian method that greatly reduces sample size, and adapt and validate a heuristic to draw more than one sample per trained model. Our Bayesian method exploits the hypothesis testing interpretation of differential privacy to obtain a posterior for $\varepsilon$ (not just a confidence interval) from the joint posterior of the false positive and false negative rates of membership inference attacks. For the same sample size and confidence, we derive confidence intervals for $\varepsilon$ around 40% narrower than prior work. The heuristic, which we adapt from label-only DP, can be used to further reduce the number of trained models needed to get enough samples by up to 2 orders of magnitude.
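To make the abstract's idea concrete, below is a minimal Python sketch of how a posterior over $\varepsilon$ can be obtained from membership inference attack outcomes. It is an illustration under simplifying assumptions, not the paper's exact construction: it places independent Beta posteriors on the attack's false positive rate (FPR) and false negative rate (FNR), whereas the paper derives a joint posterior. Each sampled (FPR, FNR) pair is converted to a lower bound on $\varepsilon$ via the hypothesis testing interpretation of $(\varepsilon, \delta)$-DP, which requires $\mathrm{FPR} + e^{\varepsilon}\,\mathrm{FNR} \ge 1 - \delta$ and $\mathrm{FNR} + e^{\varepsilon}\,\mathrm{FPR} \ge 1 - \delta$. The function name `epsilon_posterior` and the example counts are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def epsilon_posterior(fp, fn, n_pos, n_neg, delta=0.0, n_samples=100_000):
    """Monte Carlo samples from a posterior over epsilon.

    fp, fn: false positive / false negative counts of a membership
            inference attack over n_neg non-members and n_pos members.
    Assumption: independent Beta(1, 1) priors on FPR and FNR, so the
    posteriors are Beta(fp + 1, n_neg - fp + 1) and
    Beta(fn + 1, n_pos - fn + 1). The paper instead works with a joint
    posterior; this independence simplification is for illustration.
    """
    fpr = rng.beta(fp + 1, n_neg - fp + 1, size=n_samples)
    fnr = rng.beta(fn + 1, n_pos - fn + 1, size=n_samples)
    # Hypothesis-testing view of (eps, delta)-DP implies
    # eps >= max(log((1 - delta - FNR) / FPR), log((1 - delta - FPR) / FNR)).
    with np.errstate(divide="ignore", invalid="ignore"):
        eps = np.maximum(np.log((1 - delta - fnr) / fpr),
                         np.log((1 - delta - fpr) / fnr))
    return np.clip(eps, 0.0, None)  # epsilon is non-negative

# Hypothetical example: an attack making 40 false positives and
# 35 false negatives over 500 members and 500 non-members.
eps = epsilon_posterior(fp=40, fn=35, n_pos=500, n_neg=500)
lo, hi = np.quantile(eps, [0.025, 0.975])
print(f"95% credible interval for epsilon: [{lo:.3f}, {hi:.3f}]")
```

Because the output is a full posterior rather than a single interval, any credible interval (or other summary) can be read off the same samples, which is what lets the Bayesian approach yield narrower intervals than methods built from separate frequentist confidence intervals on FPR and FNR.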
