Clustering based Privacy Preserving of Big Data using Fuzzification and Anonymization Operation

Paper Title

Clustering based Privacy Preserving of Big Data using Fuzzification and Anonymization Operation

Authors

Khan, Saira; Iqbal, Khalid; Faizullah, Safi; Fahad, Muhammad; Ali, Jawad; Ahmed, Waqas

Abstract

Data miners use big data for analysis, and that data may contain sensitive information. This raises privacy challenges for researchers. Existing privacy-preserving methods use different algorithms that limit data reconstruction while securing the sensitive data. This paper presents a clustering-based probabilistic privacy-preservation model for big data that secures sensitive information while aiming for minimum perturbation and maximum privacy. In our model, sensitive data is first identified within data clusters and then secured by modifying or generalizing it. The resulting dataset is analysed to calculate the accuracy of the model in terms of hidden data and data lost as a result of reconstruction. Extensive experiments are carried out to demonstrate the results of the proposed model. Clustering-based privacy preservation of individual data in big data, with minimum perturbation and successful reconstruction, highlights the significance of our model, in addition to the use of standard performance evaluation measures.
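
The abstract does not name the clustering algorithm, the fuzzy sets, or the generalization rule, so the following is only an illustrative sketch of the general idea and not the authors' method. It assumes a single numeric sensitive attribute, a small deterministic 1-D k-means, generalization of each value to the range of its cluster, and three hypothetical fuzzy labels ("low", "medium", "high"). All function names and parameters are made up for illustration.

```python
from statistics import mean


def kmeans_1d(values, k, iterations=20):
    """Tiny deterministic 1-D k-means (an assumed stand-in for the clustering step)."""
    ordered = sorted(values)
    # Initial centroids: evenly spaced picks from the sorted values.
    centroids = [ordered[(2 * i + 1) * len(ordered) // (2 * k)] for i in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda i: abs(v - centroids[i]))
            clusters[nearest].append(v)
        # Keep the previous centroid if a cluster ends up empty.
        centroids = [mean(c) if c else centroids[i] for i, c in enumerate(clusters)]
    return centroids


def assign(value, centroids):
    """Index of the centroid closest to value."""
    return min(range(len(centroids)), key=lambda i: abs(value - centroids[i]))


def anonymize(values, k=3):
    """Generalize each value to the (min, max) range of the cluster it falls in."""
    centroids = kmeans_1d(values, k)
    labels = [assign(v, centroids) for v in values]
    ranges = {}
    for v, label in zip(values, labels):
        lo, hi = ranges.get(label, (v, v))
        ranges[label] = (min(lo, v), max(hi, v))
    return [ranges[label] for label in labels]


def fuzzify(value, low, high):
    """Map a value in [low, high] to the hypothetical fuzzy label with the
    largest triangular-style membership."""
    if not low < high:
        raise ValueError("low must be smaller than high")
    mid = (low + high) / 2
    half = (high - low) / 2
    memberships = {
        "low": max(0.0, min(1.0, (mid - value) / half)),
        "medium": max(0.0, 1.0 - abs(value - mid) / half),
        "high": max(0.0, min(1.0, (value - mid) / half)),
    }
    return max(memberships, key=memberships.get)


if __name__ == "__main__":
    salaries = [21, 23, 25, 48, 50, 52, 90, 95, 99]  # made-up sample values
    print(anonymize(salaries, k=3))
    print([fuzzify(s, 0, 100) for s in salaries])
```

In this sketch the exact values are hidden behind cluster ranges or coarse labels. How much of the original data can later be reconstructed depends on how wide those ranges are, which is the perturbation and reconstruction trade-off the abstract refers to.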
