Paper Title


TiAda: A Time-scale Adaptive Algorithm for Nonconvex Minimax Optimization

Authors

Xiang Li, Junchi Yang, Niao He

Abstract


Adaptive gradient methods have shown their ability to adjust the stepsizes on the fly in a parameter-agnostic manner, and empirically achieve faster convergence for solving minimization problems. When it comes to nonconvex minimax optimization, however, current convergence analyses of gradient descent ascent (GDA) combined with adaptive stepsizes require careful tuning of hyper-parameters and the knowledge of problem-dependent parameters. Such a discrepancy arises from the primal-dual nature of minimax problems and the necessity of delicate time-scale separation between the primal and dual updates in attaining convergence. In this work, we propose a single-loop adaptive GDA algorithm called TiAda for nonconvex minimax optimization that automatically adapts to the time-scale separation. Our algorithm is fully parameter-agnostic and can achieve near-optimal complexities simultaneously in deterministic and stochastic settings of nonconvex-strongly-concave minimax problems. The effectiveness of the proposed method is further justified numerically for a number of machine learning applications.
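To make the idea of automatic time-scale separation concrete, below is a minimal sketch of a TiAda-style single-loop adaptive GDA update. The abstract does not spell out the update rule, so the details here are assumptions: the AdaGrad-norm-style accumulators, the exponents `alpha > beta`, the `max()` coupling in the primal denominator, and the toy objective are all illustrative, not quoted from the paper.

```python
import numpy as np

def tiada_gda(grad_x, grad_y, x, y, num_iters=1000,
              eta_x=0.1, eta_y=0.1, alpha=0.6, beta=0.4, eps=1e-8):
    """Sketch of a TiAda-style single-loop adaptive GDA (assumed form).

    grad_x(x, y), grad_y(x, y): gradients of f w.r.t. the primal
    variable x and the dual variable y. Choosing alpha > beta gives
    the primal player a slower effective time scale; the max()
    coupling below is the (assumed) mechanism that shrinks the
    x-stepsize automatically whenever the dual gradients are large,
    so no problem-dependent tuning of the stepsize ratio is needed.
    """
    vx, vy = eps, eps  # accumulated squared gradient norms (eps for stability)
    for _ in range(num_iters):
        gx, gy = grad_x(x, y), grad_y(x, y)
        vx += np.sum(gx ** 2)
        vy += np.sum(gy ** 2)
        # Primal (descent) step: the denominator couples vx and vy,
        # enforcing the time-scale separation adaptively.
        x = x - eta_x / (max(vx, vy) ** alpha) * gx
        # Dual (ascent) step: standard AdaGrad-norm-style stepsize.
        y = y + eta_y / (vy ** beta) * gy
    return x, y

# Hypothetical toy problem, strongly concave in y:
# f(x, y) = 0.5*x**2 + x*y - 0.5*y**2, with stationary point (0, 0).
x_out, y_out = tiada_gda(lambda x, y: x + y,   # df/dx
                         lambda x, y: x - y,   # df/dy
                         x=np.array(3.0), y=np.array(-2.0))
```

Both stepsizes decay as gradients accumulate, in the spirit of parameter-agnostic adaptive methods; the asymmetry between the two denominators is what distinguishes this scheme from running plain AdaGrad independently on each player.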
