Paper Title

Correlated Differential Privacy: Feature Selection in Machine Learning

Authors

Tao Zhang, Tianqing Zhu, Ping Xiong, Huan Huo, Zahir Tari, Wanlei Zhou

Abstract

Privacy preservation in machine learning is a crucial issue in industrial informatics, since the data used for training in industry usually contain sensitive information. Existing differentially private machine learning algorithms have not considered the impact of data correlation, which may lead to more privacy leakage than expected in industrial applications. For example, data collected for traffic monitoring may contain correlated records due to temporal correlation or user correlation. To fill this gap, we propose a correlation reduction scheme with differentially private feature selection that addresses the privacy loss arising when data are correlated in machine learning tasks. The key to the proposed scheme is to describe the data correlation and to select features that lead to less data correlation across the whole dataset. The scheme involves five steps, with the goals of managing the extent of data correlation, preserving privacy, and supporting accuracy in the prediction results. In this way, the proposed feature selection scheme relieves the impact of data correlation, and privacy is preserved even when the training data are correlated. The proposed method can be widely used in machine learning algorithms that provide services in industrial areas. Experiments show that, compared to existing schemes, the proposed scheme produces better prediction results in machine learning tasks and lower mean squared error for data queries.
