Paper Title

Differentially Private Tree-Based Redescription Mining

Authors

Matej Mihelčić, Pauli Miettinen

Abstract

Differential privacy provides a strong form of privacy while preserving most of the original characteristics of the dataset. Utilizing these benefits requires designing dedicated differentially private data analysis algorithms. In this work, we present three tree-based algorithms for mining redescriptions while preserving differential privacy. Redescription mining is an exploratory data analysis method for finding connections between two views over the same entities, such as the phenotypes and genotypes of medical patients. It has applications in many fields, including some, like health care informatics, where privacy-preserving access to data is desired. Our algorithms are the first differentially private redescription mining algorithms, and we show via experiments that, despite the inherent noise in differential privacy, they can return trustworthy results even on smaller datasets, where noise typically has a stronger effect.
