Paper Title
On the effectiveness of partial variance reduction in federated learning with heterogeneous data
Paper Authors
Paper Abstract
Data heterogeneity across clients is a key challenge in federated learning. Prior works address this by either aligning client and server models or using control variates to correct client model drift. Although these methods achieve fast convergence in convex or simple non-convex problems, their performance on over-parameterized models such as deep neural networks falls short. In this paper, we first revisit the widely used FedAvg algorithm on deep neural networks to understand how data heterogeneity influences the gradient updates across the network layers. We observe that while the feature extraction layers are learned efficiently by FedAvg, the substantial diversity of the final classification layers across clients impedes performance. Motivated by this, we propose to correct model drift by applying variance reduction only to the final layers. We demonstrate that this significantly outperforms existing benchmarks at a similar or lower communication cost. We further provide a proof of the convergence rate of our algorithm.
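To illustrate the idea of partial variance reduction described in the abstract, the sketch below is a minimal toy simulation, not the authors' exact algorithm: local FedAvg-style SGD is used for a "features" block, while a SCAFFOLD-style control-variate correction is applied only to a "classifier" block. The block names, toy least-squares clients, and hyperparameters are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
DIM = 5  # size of each parameter block (illustrative)

def make_client():
    """Toy client: a least-squares objective over two parameter blocks,
    'features' (the extractor) and 'classifier' (the final layer)."""
    A = rng.normal(size=(20, 2 * DIM))
    b = rng.normal(size=20)
    def grad(params):
        x = np.concatenate([params["features"], params["classifier"]])
        g = A.T @ (A @ x - b) / len(b)
        return {"features": g[:DIM], "classifier": g[DIM:]}
    return grad

clients = [make_client() for _ in range(4)]
lr, local_steps, rounds = 0.05, 10, 50

server = {"features": np.zeros(DIM), "classifier": np.zeros(DIM)}
c_global = np.zeros(DIM)                    # server control variate, classifier block only
c_local = [np.zeros(DIM) for _ in clients]  # client control variates, classifier block only

for _ in range(rounds):
    updated, c_updated = [], []
    for i, grad in enumerate(clients):
        p = {k: v.copy() for k, v in server.items()}
        for _ in range(local_steps):
            g = grad(p)
            # Feature-extraction block: plain FedAvg local SGD.
            p["features"] -= lr * g["features"]
            # Final classifier block: SCAFFOLD-style drift correction.
            p["classifier"] -= lr * (g["classifier"] - c_local[i] + c_global)
        # Refresh the client control variate for the classifier block.
        c_i_new = c_local[i] - c_global + (server["classifier"] - p["classifier"]) / (lr * local_steps)
        updated.append(p)
        c_updated.append(c_i_new)
    # Server step: average client models and update the classifier control variate.
    server = {k: np.mean([p[k] for p in updated], axis=0) for k in server}
    c_global = c_global + np.mean([cn - co for cn, co in zip(c_updated, c_local)], axis=0)
    c_local = c_updated

print({k: np.round(v, 3) for k, v in server.items()})
```

In this sketch only the classifier-block control variates are exchanged, which is why the communication overhead over plain FedAvg stays small relative to correcting every layer.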