Paper Title


FederBoost: Private Federated Learning for GBDT

Paper Authors

Zhihua Tian, Rui Zhang, Xiaoyang Hou, Lingjuan Lyu, Tianyi Zhang, Jian Liu, Kui Ren

Abstract


Federated Learning (FL) has been an emerging trend in machine learning and artificial intelligence. It allows multiple participants to collaboratively train a better global model and offers a privacy-aware paradigm for model training, since it does not require participants to release their original training data. However, existing FL solutions for vertically partitioned data or decision trees require heavy cryptographic operations. In this paper, we propose a framework named FederBoost for private federated learning of gradient boosting decision trees (GBDT). It supports running GBDT over both vertically and horizontally partitioned data. Vertical FederBoost does not require any cryptographic operation, and horizontal FederBoost only requires lightweight secure aggregation. The key observation is that the whole training process of GBDT relies on the ordering of the data instead of the values. We fully implement FederBoost and evaluate its utility and efficiency through extensive experiments performed on three public datasets. Our experimental results show that both vertical and horizontal FederBoost achieve the same level of accuracy as centralized training, where all data are collected in a central server, and that they are 4-5 orders of magnitude faster than the state-of-the-art solutions for federated decision tree training; hence they offer practical solutions for industrial applications.
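The key observation above, that GBDT training depends only on the ordering of samples under each feature and not on the raw feature values, can be illustrated with a short sketch. This is an illustrative reconstruction, not the paper's implementation: the helper names are hypothetical, and it uses XGBoost-style split gains over rank-based quantile buckets. Note that two features related by any monotone transform yield the same ordering, and therefore the same split decisions:

```python
import numpy as np

def split_gain(G_L, H_L, G_R, H_R, lam=1.0):
    """XGBoost-style split gain from gradient/hessian sums."""
    def score(g, h):
        return g * g / (h + lam)
    return 0.5 * (score(G_L, H_L) + score(G_R, H_R) - score(G_L + G_R, H_L + H_R))

def best_split_from_order(order, grad, hess, n_buckets=4):
    """Find the best split using ONLY the ordering of samples by a feature.

    `order` is the permutation that sorts samples by the feature value;
    the raw values never appear -- candidate thresholds are rank-based
    quantile bucket boundaries.
    """
    n = len(order)
    edges = [n * k // n_buckets for k in range(1, n_buckets)]  # rank boundaries
    G_total, H_total = grad.sum(), hess.sum()
    best_gain, best_edge = -np.inf, None
    G_L = H_L = 0.0
    prev = 0
    for e in edges:
        # accumulate gradient/hessian sums of the next bucket into the left side
        G_L += grad[order[prev:e]].sum()
        H_L += hess[order[prev:e]].sum()
        prev = e
        gain = split_gain(G_L, H_L, G_total - G_L, H_total - H_L)
        if gain > best_gain:
            best_gain, best_edge = gain, e
    return best_gain, best_edge

# Two parties hold differently scaled versions of the same feature:
# a monotone transform preserves the ordering, hence the chosen split.
rng = np.random.default_rng(0)
x = rng.normal(size=100)
grad = rng.normal(size=100)   # first-order gradients of the loss
hess = np.ones(100)           # second-order terms (1 for squared loss)

order_a = np.argsort(x)           # party A's raw values
order_b = np.argsort(2 * x + 5)   # monotone transform: identical ordering
assert best_split_from_order(order_a, grad, hess) == \
       best_split_from_order(order_b, grad, hess)
```

This is why, in the vertical setting, parties can share bucketized orderings rather than feature values: the training algorithm never needs anything beyond which bucket each sample falls into.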
