Paper Title

Tight Differential Privacy Guarantees for the Shuffle Model with $k$-Randomized Response

Paper Authors

Sayan Biswas, Kangsoo Jung, Catuscia Palamidessi

Abstract

Most differentially private (DP) algorithms assume either a central model, in which a trusted third party adds noise to queries made on datasets, or a local model, in which the users perturb their data locally. However, the central model is vulnerable due to a single point of failure, and in the local model the utility of the data deteriorates significantly. The recently proposed shuffle model is an intermediate framework between the central and local paradigms: users send their locally privatized data to a server that shuffles the messages, effacing the link between a privatized message and the corresponding user. This yields a better privacy-utility trade-off than the local model, since privacy is amplified without adding more noise. In this paper, we theoretically derive the tightest known DP guarantee for shuffle models with $k$-Randomized Response ($k$-RR) local randomizers. We then focus on the utility of the shuffle model for histogram queries. Leveraging the matrix inversion method, which approximates the original distribution from the empirical one produced by the $k$-RR mechanism, we de-noise the histogram produced by the shuffle model and evaluate the total variation distance between the resulting histogram and the true one, which we regard as the measure of the utility of the privacy mechanism. We perform experiments on both synthetic and real data to compare the privacy-utility trade-off of the shuffle model with that of the central model privatized by adding state-of-the-art Gaussian noise to each bin. Although the experimental results stay consistent with the literature favouring the central model, we observe that the difference in statistical utility between the central and the shuffle models is very small, showing that they are almost comparable under the same level of DP.
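The pipeline the abstract describes (local $k$-RR randomization, server-side shuffling, and matrix-inversion de-noising of the resulting histogram, scored by total variation distance) can be sketched as follows. This is an illustrative NumPy simulation under assumed parameters (`k = 10`, `eps = 2.0`), not the authors' implementation; all function names are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(0)

def krr_privatize(data, k, eps):
    # k-RR: report the true value with probability e^eps / (e^eps + k - 1),
    # otherwise report one of the other k - 1 values uniformly at random.
    p_keep = np.exp(eps) / (np.exp(eps) + k - 1)
    out = data.copy()
    flip = rng.random(len(data)) >= p_keep
    # offset in 1..k-1 guarantees the reported value differs from the truth
    offset = rng.integers(1, k, size=int(flip.sum()))
    out[flip] = (data[flip] + offset) % k
    return out

def denoise_histogram(reports, k, eps):
    # Empirical distribution q of the shuffled reports satisfies E[q] = C p,
    # where C is the column-stochastic k-RR channel matrix; solving the
    # linear system de-biases the histogram (matrix inversion method).
    q = np.bincount(reports, minlength=k) / len(reports)
    p_keep = np.exp(eps) / (np.exp(eps) + k - 1)
    p_flip = (1 - p_keep) / (k - 1)
    C = np.full((k, k), p_flip)
    np.fill_diagonal(C, p_keep)
    return np.linalg.solve(C, q)

def tv_distance(p, q):
    # Total variation distance, the utility measure used in the paper.
    return 0.5 * np.abs(p - q).sum()

# Demo with arbitrary parameters.
k, eps, n = 10, 2.0, 50_000
data = rng.integers(0, k, size=n)
true_hist = np.bincount(data, minlength=k) / n

reports = krr_privatize(data, k, eps)
rng.shuffle(reports)  # shuffle model: the server permutes the reports

est = denoise_histogram(reports, k, eps)
print(tv_distance(est, true_hist))
```

Note that de-noising only exploits the known channel matrix, not the shuffle itself: shuffling amplifies privacy (allowing a smaller local ε for the same end-to-end guarantee) rather than changing the histogram estimator. The recovered estimate sums to 1 because the channel matrix is column-stochastic, though individual entries can go slightly negative at small ε.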
