Paper Title

Not one but many Tradeoffs: Privacy Vs. Utility in Differentially Private Machine Learning

Authors

Benjamin Zi Hao Zhao, Mohamed Ali Kaafar, Nicolas Kourtellis

Abstract

Data holders are increasingly seeking to protect their users' privacy, whilst still maximizing their ability to produce machine learning models with high-quality predictions. In this work, we empirically evaluate various implementations of differential privacy (DP), measuring their ability to fend off real-world privacy attacks in addition to their core goal of providing accurate classifications. We establish an evaluation framework to ensure that each of these implementations is fairly evaluated. Our selected DP implementations add DP noise at different positions within the framework: at the point of data collection/release, during updates while training the model, or after training by perturbing learned model parameters. We evaluate each implementation across a range of privacy budgets and datasets, with each implementation providing the same mathematical privacy guarantees. By measuring the models' resistance to real-world membership and attribute inference attacks, as well as their classification accuracy, we determine which implementations provide the most desirable tradeoff between privacy and utility. We find that the number of classes in a given dataset is unlikely to influence where the privacy-utility tradeoff occurs. Additionally, in scenarios that require high privacy constraints, perturbing the input training data does not trade off as much utility as noise added later in the ML process.
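The three noise-injection positions described in the abstract can be sketched as follows. This is a minimal illustration of the general idea, not the paper's actual implementations: the function names, the use of the Laplace mechanism throughout, and the sensitivity/clipping parameters are all illustrative assumptions.

```python
import math
import random

def laplace_noise(scale):
    # Sample from a Laplace(0, scale) distribution via inverse-CDF.
    u = random.random() - 0.5
    return -scale * math.copysign(1.0, u) * math.log(1 - 2 * abs(u))

def perturb_input(record, epsilon, sensitivity=1.0):
    # Position 1: noise added at data collection/release time,
    # before the record ever reaches the training pipeline.
    return [x + laplace_noise(sensitivity / epsilon) for x in record]

def perturb_gradient(grad, epsilon, clip=1.0):
    # Position 2: noise added during training updates. Each gradient
    # is clipped to bound its norm, then perturbed (DP-SGD style).
    norm = math.sqrt(sum(g * g for g in grad))
    factor = min(1.0, clip / norm) if norm > 0 else 1.0
    return [g * factor + laplace_noise(clip / epsilon) for g in grad]

def perturb_output(params, epsilon, sensitivity=1.0):
    # Position 3: noise added after training, by perturbing the
    # learned model parameters before release.
    return [p + laplace_noise(sensitivity / epsilon) for p in params]
```

In each case a smaller privacy budget epsilon yields a larger noise scale, which is the source of the privacy-utility tradeoff the paper measures at each position.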
