Paper title
Privacy-preserving record linkage using local sensitive hash and private set intersection
Paper authors
Paper abstract
The amount of data stored in data repositories increases every year. This makes it challenging to link records between different datasets across companies and even internally, while adhering to privacy regulations. Address or name changes, and even different spellings used for entity data, can prevent companies from using private deduplication or record-linking solutions such as private set intersection (PSI). To this end, we propose a new and efficient privacy-preserving record linkage (PPRL) protocol that combines PSI and local sensitive hash (LSH) functions, and runs in linear time. We explain the privacy guarantees that our protocol provides and demonstrate its practicality by executing the protocol over two datasets with $2^{20}$ records each, in 11 to 45 minutes, depending on network settings.
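To illustrate the general idea of pairing LSH with set intersection, the following is a minimal, non-private sketch and not the protocol described in the paper. It assumes character-bigram shingling and MinHash banding as the LSH scheme, and it uses an ordinary dictionary lookup where a real deployment would run a PSI protocol over the band keys. All function names and parameters (`shingles`, `minhash`, `band_keys`, `candidate_links`, `num_hashes`, `rows`) are hypothetical choices made for this example.

```python
import hashlib


def shingles(text, n=2):
    """Return the set of character n-grams of a whitespace/case-normalized string."""
    s = " ".join(text.lower().split())
    return {s[i:i + n] for i in range(max(len(s) - n + 1, 1))}


def minhash(items, num_hashes=32):
    """MinHash signature: for each seed, the minimum seeded hash over the set."""
    signature = []
    for seed in range(num_hashes):
        signature.append(min(
            int.from_bytes(
                hashlib.sha256(f"{seed}:{item}".encode("utf-8")).digest()[:8], "big")
            for item in items))
    return signature


def band_keys(text, num_hashes=32, rows=4):
    """Split the signature into bands and hash each band into one opaque key."""
    sig = minhash(shingles(text), num_hashes)
    keys = set()
    for start in range(0, num_hashes, rows):
        band = (start // rows, tuple(sig[start:start + rows]))
        keys.add(hashlib.sha256(repr(band).encode("utf-8")).hexdigest())
    return keys


def candidate_links(records_a, records_b):
    """Pair records that share at least one band key.

    ASSUMPTION: this plain dictionary lookup stands in for PSI and offers
    no privacy; a real protocol would intersect the keys under encryption.
    """
    index = {}
    for rb in records_b:
        for key in band_keys(rb):
            index.setdefault(key, set()).add(rb)
    links = set()
    for ra in records_a:
        for key in band_keys(ra):
            for rb in index.get(key, ()):
                links.add((ra, rb))
    return links
```

Records that are similar but not identical tend to share some band keys, so they can surface in an exact-match intersection that would otherwise miss them. Each record contributes a fixed number of keys, and the index is hash-based, so the work in this sketch grows roughly linearly with the number of records. The values of `num_hashes` and `rows` trade recall against false candidates and would need tuning for real data.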