Title

Alternating proximal-gradient steps for (stochastic) nonconvex-concave minimax problems

Authors

Radu Ioan Boţ, Axel Böhm

Abstract

Minimax problems of the form $\min_x \max_y Ψ(x,y)$ have attracted increased interest largely due to advances in machine learning, in particular generative adversarial networks. These are typically trained using variants of stochastic gradient descent for the two players. Although convex-concave problems are well understood with many efficient solution methods to choose from, theoretical guarantees outside of this setting are sometimes lacking even for the simplest algorithms. In particular, this is the case for alternating gradient descent ascent, where the two agents take turns updating their strategies. To partially close this gap in the literature we prove a novel global convergence rate for the stochastic version of this method for finding a critical point of $g(\cdot) := \max_y Ψ(\cdot,y)$ in a setting which is not convex-concave.
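The alternating scheme the abstract refers to, where the two players take turns updating their strategies, can be sketched on a toy problem. This is only an illustrative sketch of alternating (stochastic) gradient descent ascent, not the paper's proximal-gradient method; the objective Ψ(x, y) = sin(x)·y − y²/2 (nonconvex in x, strongly concave in y), the step sizes, and the noise model are all assumptions chosen for the example.

```python
import math
import random

def alternating_sgda(grad_x, grad_y, x0, y0, tau=0.05, sigma=0.5,
                     iters=2000, noise_std=0.0, seed=0):
    """Alternating (stochastic) gradient descent ascent:
    x takes a descent step, then y takes an ascent step at the fresh x.
    Setting noise_std > 0 mimics a stochastic gradient oracle."""
    rng = random.Random(seed)
    x, y = x0, y0
    for _ in range(iters):
        gx = grad_x(x, y) + rng.gauss(0.0, noise_std)  # (noisy) gradient in x
        x = x - tau * gx                               # descent step for x
        gy = grad_y(x, y) + rng.gauss(0.0, noise_std)  # evaluated at the new x
        y = y + sigma * gy                             # ascent step for y
    return x, y

# Toy objective: Psi(x, y) = sin(x) * y - y**2 / 2, so that
# g(x) = max_y Psi(x, y) = sin(x)**2 / 2 and g'(x) = sin(x) * cos(x).
x, y = alternating_sgda(grad_x=lambda x, y: math.cos(x) * y,
                        grad_y=lambda x, y: math.sin(x) - y,
                        x0=1.0, y0=0.0)
# |g'(x)| measures stationarity of g at the final iterate.
print(abs(math.sin(x) * math.cos(x)))
```

On this toy instance the iterates approach x = 0, a critical point of g, since y quickly tracks the inner maximizer sin(x) and x then descends g.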
