Paper Title
Sparse Random Networks for Communication-Efficient Federated Learning
Paper Authors
Paper Abstract
One main challenge in federated learning is the large communication cost of exchanging weight updates from clients to the server at each round. While prior work has made great progress in compressing the weight updates through gradient compression methods, we propose a radically different approach that does not update the weights at all. Instead, our method freezes the weights at their initial \emph{random} values and learns how to sparsify the random network for the best performance. To this end, the clients collaborate in training a \emph{stochastic} binary mask to find the optimal sparse random network within the original one. At the end of the training, the final model is a sparse network with random weights -- or a subnetwork inside the dense random network. We show improvements in accuracy, communication cost (less than $1$ bit per parameter (bpp)), convergence speed, and final model size (less than $1$ bpp) over relevant baselines on MNIST, EMNIST, CIFAR-10, and CIFAR-100 datasets, in the low bitrate regime under various system configurations.
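A minimal sketch of the core idea described in the abstract, written in PyTorch: weights are frozen at their random initial values, and each weight gets a learnable score whose sigmoid is the keep-probability of a sampled binary mask, trained with a straight-through estimator. The names `MaskedLinear` and `BernoulliMask` are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn


class BernoulliMask(torch.autograd.Function):
    """Sample a 0/1 mask from keep-probabilities; pass gradients straight through."""

    @staticmethod
    def forward(ctx, probs):
        return torch.bernoulli(probs)

    @staticmethod
    def backward(ctx, grad_output):
        return grad_output  # straight-through estimator


class MaskedLinear(nn.Module):
    """Linear layer with frozen random weights and a learnable stochastic binary mask."""

    def __init__(self, in_features, out_features):
        super().__init__()
        # Weights stay at their random initial values and are never updated.
        self.weight = nn.Parameter(
            torch.randn(out_features, in_features) * in_features ** -0.5,
            requires_grad=False,
        )
        # One learnable score per weight; sigmoid(score) is the keep-probability.
        self.score = nn.Parameter(torch.zeros(out_features, in_features))

    def forward(self, x):
        probs = torch.sigmoid(self.score)
        mask = BernoulliMask.apply(probs)  # sampled binary mask
        return nn.functional.linear(x, self.weight * mask)


# Usage: only the scores receive gradients; in the federated setting, clients
# would communicate sampled binary masks (about 1 bit per parameter) instead of
# dense weight updates.
layer = MaskedLinear(784, 10)
out = layer(torch.randn(32, 784))
out.sum().backward()
print(layer.score.grad.shape, layer.weight.grad)  # scores get gradients; frozen weights do not
```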