Paper Title

Federated Adversarial Learning: A Framework with Convergence Analysis

Authors

Xiaoxiao Li, Zhao Song, Jiaming Yang

Abstract

Federated learning (FL) is a trending training paradigm to utilize decentralized training data. FL allows clients to update model parameters locally for several epochs, then share them with a global model for aggregation. This training paradigm, with multiple local update steps before aggregation, exposes unique vulnerabilities to adversarial attacks. Adversarial training is a popular and effective method to improve the robustness of networks against adversaries. In this work, we formulate a general form of federated adversarial learning (FAL) that is adapted from adversarial learning in the centralized setting. On the client side of FL training, FAL has an inner loop that generates adversarial samples for adversarial training and an outer loop that updates local model parameters. On the server side, FAL aggregates the local model updates and broadcasts the aggregated model. We design a global robust training loss and formulate FAL training as a min-max optimization problem. Unlike the convergence analysis in classical centralized training, which relies on the gradient direction, the convergence of FAL is significantly harder to analyze for three reasons: 1) the complexity of min-max optimization, 2) the model not updating along the gradient direction because of the multiple local updates on the client side before aggregation, and 3) inter-client heterogeneity. We address these challenges with appropriate gradient-approximation and coupling techniques and present a convergence analysis in the over-parameterized regime. Our main result theoretically shows that the minimum loss under our algorithm can converge to $ε$ small with a chosen learning rate and number of communication rounds. Notably, our analysis is feasible for non-IID clients.
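
As a concrete reading of the abstract, below is a minimal sketch of one FAL communication round, assuming a PGD-style inner maximization, plain SGD for the clients' outer loop, and FedAvg-style averaging on the server. The paper's exact attack, robust loss, and aggregation rule may differ; all function and variable names here are illustrative, not the authors' implementation.

```python
# Sketch of a federated adversarial learning (FAL) round, under the assumptions above.
import copy
import torch
import torch.nn as nn
import torch.nn.functional as F


def pgd_attack(model, x, y, eps=0.1, alpha=0.02, steps=5):
    """Inner loop: generate adversarial samples by projected gradient ascent on the loss."""
    x_adv = x.clone().detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad = torch.autograd.grad(loss, x_adv)[0]
        # Ascend on the loss, then project back into the eps-ball around x.
        x_adv = (x_adv + alpha * grad.sign()).detach()
        x_adv = torch.max(torch.min(x_adv, x + eps), x - eps)
    return x_adv


def local_adversarial_update(global_model, data, local_epochs=2, lr=0.05):
    """Outer loop on a client: several local epochs of SGD on adversarial samples."""
    model = copy.deepcopy(global_model)
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    for _ in range(local_epochs):
        for x, y in data:
            x_adv = pgd_attack(model, x, y)
            opt.zero_grad()
            F.cross_entropy(model(x_adv), y).backward()
            opt.step()
    return model.state_dict()


def fal_round(global_model, client_loaders):
    """Server side: collect local updates, average them, and broadcast the result."""
    states = [local_adversarial_update(global_model, loader) for loader in client_loaders]
    avg = {k: torch.stack([s[k] for s in states]).mean(dim=0) for k in states[0]}
    global_model.load_state_dict(avg)
    return global_model


if __name__ == "__main__":
    # Toy non-IID setup: two clients whose synthetic data are shifted differently.
    torch.manual_seed(0)
    model = nn.Linear(10, 2)
    clients = []
    for shift in (0.0, 1.0):
        x = torch.randn(32, 10) + shift
        y = torch.randint(0, 2, (32,))
        clients.append([(x, y)])
    for _ in range(3):  # communication rounds
        model = fal_round(model, clients)
```

In this reading, the min-max structure appears as the inner pgd_attack (maximizing the loss within a perturbation ball) nested inside the clients' SGD steps that minimize the same loss, while the server only averages and broadcasts parameters, which is exactly why the global update need not follow the gradient direction.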
