Paper Title

Information-Theoretic Bounds on the Generalization Error and Privacy Leakage in Federated Learning

Paper Authors

Semih Yagli, Alex Dytso, H. Vincent Poor

Paper Abstract

Machine learning algorithms operating on mobile networks can be grouped into three different categories. First is the classical setting in which the end-user devices send their data to a central server, where the data are used to train a model. Second is the distributed setting in which each device trains its own model and sends its model parameters to a central server, where these parameters are aggregated to create one final model. Third is the federated learning setting in which, at any given time $t$, a certain number of active end users train with their own local data along with feedback provided by the central server and then send their newly estimated model parameters to the central server. The server then aggregates these new parameters, updates its own model, and feeds the updated parameters back to all the end users, continuing this process until convergence. The main objective of this work is to provide an information-theoretic framework for all of the aforementioned learning paradigms. Moreover, using the provided framework, we develop upper and lower bounds on the generalization error together with bounds on the privacy leakage in the classical, distributed, and federated learning settings.

Keywords: Federated Learning, Distributed Learning, Machine Learning, Model Aggregation.
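To make the third paradigm concrete, below is a minimal Python sketch of the train-aggregate-broadcast loop the abstract describes. It assumes plain unweighted parameter averaging (in the spirit of FedAvg) and local gradient descent on a toy least-squares problem; the paper's actual aggregation rule, models, and bounds are not given in the abstract, so every function name and hyperparameter here is illustrative.

```python
import numpy as np

def local_update(params, data, lr=0.1, steps=5):
    """Hypothetical local training: a few gradient steps on a least-squares loss."""
    X, y = data
    w = params.copy()
    for _ in range(steps):
        grad = X.T @ (X @ w - y) / len(y)  # gradient of (1/2n)*||Xw - y||^2
        w = w - lr * grad
    return w

def federated_round(global_params, active_clients):
    """One round: each active user trains on its own data starting from the
    server's current model, then the server averages the returned parameters."""
    local_params = [local_update(global_params, data) for data in active_clients]
    return np.mean(local_params, axis=0)  # simple unweighted aggregation (assumption)

# Toy run: three clients whose local data share one underlying linear model.
rng = np.random.default_rng(0)
true_w = np.array([1.0, -2.0])
clients = []
for _ in range(3):
    X = rng.normal(size=(20, 2))
    clients.append((X, X @ true_w + 0.1 * rng.normal(size=20)))

w = np.zeros(2)
for t in range(50):  # iterate train -> aggregate -> broadcast until (near) convergence
    w = federated_round(w, clients)
print(w)  # approaches true_w
```

The classical setting corresponds to pooling every client's `(X, y)` at the server and training once; the distributed setting corresponds to a single `federated_round` with no further feedback. The iterative loop above is what distinguishes federated learning from those two cases.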
