Paper Title

SAGDA: Achieving $\mathcal{O}(ε^{-2})$ Communication Complexity in Federated Min-Max Learning

Paper Authors

Haibo Yang, Zhuqing Liu, Xin Zhang, Jia Liu

Paper Abstract

To lower the communication complexity of federated min-max learning, a natural approach is to utilize the idea of infrequent communication (through multiple local updates), as in conventional federated learning. However, due to the more complicated inter-outer problem structure in federated min-max learning, theoretical understanding of the communication complexity of federated min-max learning with infrequent communication remains very limited in the literature. This is particularly true for settings with non-i.i.d. datasets and partial client participation. To address this challenge, in this paper, we propose a new algorithmic framework called stochastic sampling averaging gradient descent ascent (SAGDA), which i) assembles stochastic gradient estimators from randomly sampled clients as control variates and ii) leverages two learning rates on both server and client sides. We show that SAGDA achieves a linear speedup in terms of both the number of clients and local update steps, which yields an $\mathcal{O}(ε^{-2})$ communication complexity that is orders of magnitude lower than the state of the art. Interestingly, by noting that the standard federated stochastic gradient descent ascent (FSGDA) is in fact a control-variate-free special version of SAGDA, we immediately arrive at an $\mathcal{O}(ε^{-2})$ communication complexity result for FSGDA. Therefore, through the lens of SAGDA, we also advance the current understanding of the communication complexity of the standard FSGDA method for federated min-max learning.
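
For intuition, below is a minimal, hypothetical Python sketch of one communication round combining the two ingredients the abstract names: control variates assembled from the stochastic gradients of randomly sampled clients (a SCAFFOLD-style drift correction), and separate client-side and server-side learning rates. The toy saddle objective, the function name `stoch_grads`, and the exact form of the correction are illustrative assumptions, not the paper's precise SAGDA construction.

```python
import numpy as np

# Hypothetical sketch of one SAGDA-style communication round (not the
# paper's exact algorithm). Toy strongly-convex-strongly-concave objective
# f(x, y) = ||x||^2/2 + x.y - ||y||^2/2, whose saddle point is (0, 0).

rng = np.random.default_rng(0)
num_clients, dim, K = 10, 5, 4      # K local steps between communications
eta_c, eta_s = 0.1, 1.0             # client-side / server-side learning rates

def stoch_grads(i, x, y):
    """Placeholder stochastic gradients (grad_x f_i, grad_y f_i) at (x, y)."""
    noise = rng.normal(scale=0.1, size=dim)
    return x + y + noise, x - y + noise

x, y = np.ones(dim), np.ones(dim)   # server iterate
sampled = rng.choice(num_clients, size=3, replace=False)  # partial participation

# Control variates: averaged stochastic gradients of the sampled clients at
# the round-start iterate, serving as estimates of the global gradients.
grads0 = [stoch_grads(i, x, y) for i in sampled]
cx = np.mean([g[0] for g in grads0], axis=0)
cy = np.mean([g[1] for g in grads0], axis=0)

deltas_x, deltas_y = [], []
for i, (lcx, lcy) in zip(sampled, grads0):
    xi, yi = x.copy(), y.copy()
    for _ in range(K):              # local updates with drift correction
        gx, gy = stoch_grads(i, xi, yi)
        xi -= eta_c * (gx - lcx + cx)   # corrected descent on the min variable
        yi += eta_c * (gy - lcy + cy)   # corrected ascent on the max variable
    deltas_x.append(xi - x)
    deltas_y.append(yi - y)

# Server step: aggregate local changes with the server-side learning rate.
x += eta_s * np.mean(deltas_x, axis=0)
y += eta_s * np.mean(deltas_y, axis=0)
print("post-round iterates:", np.round(x, 3), np.round(y, 3))
```

One way to read the abstract's claims through this sketch: decoupling `eta_s` from `eta_c` lets the server take larger aggregate steps while the control variates keep client drift over the `K` local steps in check, which is the mechanism behind the stated linear speedup and the $\mathcal{O}(ε^{-2})$ communication complexity.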
