Paper Title
pMPL: A Robust Multi-Party Learning Framework with a Privileged Party
Paper Authors
Paper Abstract
In order to perform machine learning among multiple parties while protecting the privacy of raw data, privacy-preserving machine learning based on secure multi-party computation, referred to as secure multi-party learning (MPL for short), has been a hot spot in recent years. The configuration of MPL usually follows a peer-to-peer architecture, where each party has the same chance to reveal the output result. However, typical business scenarios often follow a hierarchical architecture, where a powerful (and usually privileged) party leads the machine learning task. Only the privileged party can reveal the final model, even if the other assistant parties collude with each other. It is further required that machine learning not abort when some assistant parties drop out, so as to meet scheduled deadlines and/or avoid wasting the computing resources already spent. Motivated by the above scenarios, we propose pMPL, a robust MPL framework with a privileged party. pMPL supports three-party training in the semi-honest setting. By setting alternate shares for the privileged party, pMPL is robust in that it tolerates either of the other two parties dropping out during training. With the above settings, we design a series of efficient protocols based on vector space secret sharing for pMPL to bridge the gap between vector space secret sharing and machine learning. Finally, the experimental results show that the performance of pMPL is promising compared with state-of-the-art MPL frameworks. In particular, in the LAN setting, pMPL is around $16\times$ and $5\times$ faster than TF-Encrypted (with ABY3 as the back-end framework) for linear regression and logistic regression, respectively. Besides, the accuracy of the trained models for linear regression, logistic regression, and BP neural networks reaches around 97%, 99%, and 96% on the MNIST dataset, respectively.
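To make the core primitive concrete, below is a minimal Python sketch of vector space secret sharing with an alternate share, under simplifying assumptions that are not from the paper: Vandermonde-style public vectors $v_i = (1, i, i^2)$ (with which the scheme coincides with Shamir sharing over a degree-2 polynomial), an illustrative prime modulus P, and the alternate share modeled as the privileged party holding a second evaluation point. The names P, POINTS, share, and reconstruct are hypothetical helpers; pMPL's actual public matrix, field, and protocols differ. The sketch only illustrates why the privileged party's main and alternate shares, together with one assistant party, can still reconstruct when the other assistant party drops out.

import secrets

# Illustrative prime field modulus (an assumption, not pMPL's parameter).
P = 2**61 - 1

# Evaluation points: the privileged party holds points 1 and 4 (its main and
# alternate shares); the two assistant parties hold points 2 and 3.
# Any three of the four shares suffice to reconstruct the secret.
POINTS = [1, 2, 3, 4]

def share(x):
    """Split secret x into four shares: share_i = x + r1*i + r2*i^2 mod P,
    i.e. the inner product of (x, r1, r2) with the public vector (1, i, i^2)."""
    r1, r2 = secrets.randbelow(P), secrets.randbelow(P)
    return {i: (x + r1 * i + r2 * i * i) % P for i in POINTS}

def reconstruct(shares):
    """Recover x from any 3 shares by a public linear combination
    (here, Lagrange interpolation at 0)."""
    idx = list(shares)[:3]
    x = 0
    for i in idx:
        lam = 1
        for j in idx:
            if j != i:
                lam = lam * (-j) * pow(i - j, -1, P) % P
        x = (x + lam * shares[i]) % P
    return x

# Dropout tolerance: even if the assistant party at point 3 drops out, the
# privileged party's two shares plus point 2 still reconstruct the secret.
s = share(123456789)
assert reconstruct({1: s[1], 2: s[2], 4: s[4]}) == 123456789

In this toy instance, the privileged party is the only one holding two evaluation points, so no coalition of the two assistant parties alone spans enough shares to recover the secret, mirroring the asymmetry that pMPL's hierarchical setting requires.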