Paper Title

Variance Reduction is an Antidote to Byzantines: Better Rates, Weaker Assumptions and Communication Compression as a Cherry on the Top

Authors

Eduard Gorbunov, Samuel Horváth, Peter Richtárik, Gauthier Gidel

Abstract

Byzantine-robustness has been gaining a lot of attention due to the growth of the interest in collaborative and federated learning. However, many fruitful directions, such as the usage of variance reduction for achieving robustness and communication compression for reducing communication costs, remain weakly explored in the field. This work addresses this gap and proposes Byz-VR-MARINA - a new Byzantine-tolerant method with variance reduction and compression. A key message of our paper is that variance reduction is key to fighting Byzantine workers more effectively. At the same time, communication compression is a bonus that makes the process more communication efficient. We derive theoretical convergence guarantees for Byz-VR-MARINA outperforming previous state-of-the-art for general non-convex and Polyak-Lojasiewicz loss functions. Unlike the concurrent Byzantine-robust methods with variance reduction and/or compression, our complexity results are tight and do not rely on restrictive assumptions such as boundedness of the gradients or limited compression. Moreover, we provide the first analysis of a Byzantine-tolerant method supporting non-uniform sampling of stochastic gradients. Numerical experiments corroborate our theoretical findings.
