Paper Title

A fast randomized incremental gradient method for decentralized non-convex optimization

Authors

Ran Xin, Usman A. Khan, Soummya Kar

Abstract

We study decentralized non-convex finite-sum minimization problems described over a network of nodes, where each node possesses a local batch of data samples. In this context, we analyze a single-timescale randomized incremental gradient method, called GT-SAGA. GT-SAGA is computationally efficient as it evaluates one component gradient per node per iteration and achieves provably fast and robust performance by leveraging node-level variance reduction and network-level gradient tracking. For general smooth non-convex problems, we show the almost sure and mean-squared convergence of GT-SAGA to a first-order stationary point and further describe regimes of practical significance where it, respectively, outperforms existing approaches and achieves a network topology-independent iteration complexity. When the global function satisfies the Polyak-Łojasiewicz condition, we show that GT-SAGA exhibits linear convergence to an optimal solution in expectation and describe regimes of practical interest where the performance is network topology-independent and improves upon existing methods. Numerical experiments are included to highlight the main convergence aspects of GT-SAGA in non-convex settings.
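Since the abstract only sketches the algorithm, the following is a minimal illustrative sketch of the update structure it describes: each node holds a SAGA-style table of component gradients for node-level variance reduction, and a network-level gradient tracker is mixed over a doubly stochastic weight matrix. This is not the authors' exact pseudocode; the toy least-squares problem and the names `grad_component`, `ring_mixing_matrix`, and the chosen step size are assumptions made for illustration.

```python
import numpy as np

# Illustrative GT-SAGA-style sketch on a toy decentralized least-squares
# problem. The problem data and helper names are assumptions, not from the
# paper; the update structure follows the abstract: one component gradient
# per node per iteration, SAGA variance reduction, and gradient tracking.

rng = np.random.default_rng(0)

n_nodes, m_samples, dim = 8, 20, 5           # nodes, samples per node, dimension
A = rng.standard_normal((n_nodes, m_samples, dim))
b = rng.standard_normal((n_nodes, m_samples))

def grad_component(i, s, x):
    """Gradient of the s-th local component f_{i,s}(x) = 0.5*(a^T x - b)^2."""
    a = A[i, s]
    return (a @ x - b[i, s]) * a

def ring_mixing_matrix(n):
    """Doubly stochastic weights for an undirected ring (lazy Metropolis)."""
    W = np.zeros((n, n))
    for i in range(n):
        W[i, i] = 0.5
        W[i, (i - 1) % n] = 0.25
        W[i, (i + 1) % n] = 0.25
    return W

W = ring_mixing_matrix(n_nodes)
alpha = 0.01                                  # constant step size (assumed)

# Per-node state: iterate x_i, SAGA gradient table, tracker y_i.
x = np.zeros((n_nodes, dim))
table = np.array([[grad_component(i, s, x[i]) for s in range(m_samples)]
                  for i in range(n_nodes)])
table_avg = table.mean(axis=1)
v_prev = table_avg.copy()                     # initial local estimators v_i^0
y = v_prev.copy()                             # tracker initialized to v_i^0

for k in range(500):
    # Descent step fused with network mixing: x_i <- sum_j w_ij (x_j - alpha*y_j).
    x = W @ (x - alpha * y)

    # SAGA estimator: one randomly sampled component gradient per node.
    v = np.empty_like(v_prev)
    for i in range(n_nodes):
        s = rng.integers(m_samples)
        g_new = grad_component(i, s, x[i])
        v[i] = g_new - table[i, s] + table_avg[i]
        table_avg[i] += (g_new - table[i, s]) / m_samples
        table[i, s] = g_new

    # Gradient tracking: y_i <- sum_j w_ij y_j + v_i^{new} - v_i^{old}.
    y = W @ y + v - v_prev
    v_prev = v

print("node disagreement:", np.linalg.norm(x - x.mean(axis=0)))
print("avg gradient norm:", np.linalg.norm(np.mean(
    [grad_component(i, s, x.mean(axis=0))
     for i in range(n_nodes) for s in range(m_samples)], axis=0)))
```

In this sketch the mixing step, the sampled-gradient estimator, and the tracker update each cost one round of neighbor communication and one component-gradient evaluation per node, which is the single-timescale, low per-iteration cost the abstract emphasizes.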
