Paper Title
Generalized Optimistic Methods for Convex-Concave Saddle Point Problems
Paper Authors
Paper Abstract
The optimistic gradient method has seen increasing popularity for solving convex-concave saddle point problems. To analyze its iteration complexity, a recent work [arXiv:1906.01115] proposed an interesting perspective that interprets this method as an approximation to the proximal point method. In this paper, we follow this approach and distill the underlying idea of optimism to propose a generalized optimistic method, which includes the optimistic gradient method as a special case. Our general framework can handle constrained saddle point problems with composite objective functions and can work with arbitrary norms using Bregman distances. Moreover, we develop a backtracking line search scheme to select the step sizes without knowledge of the smoothness coefficients. We instantiate our method with first-, second-, and higher-order oracles and give the best-known global iteration complexity bounds. For our first-order method, we show that the averaged iterates converge at a rate of $O(1/N)$ when the objective function is convex-concave, and that the method achieves linear convergence when the objective is strongly-convex-strongly-concave. For our second- and higher-order methods, under the additional assumption that the distance-generating function has a Lipschitz gradient, we prove a complexity bound of $O(1/ε^{\frac{2}{p+1}})$ in the convex-concave setting and a complexity bound of $O((L_p D^{\frac{p-1}{2}}/μ)^{\frac{2}{p+1}} + \log\log\frac{1}{ε})$ in the strongly-convex-strongly-concave setting, where $L_p$ ($p \geq 2$) is the Lipschitz constant of the $p$-th-order derivative, $μ$ is the strong convexity parameter, and $D$ is the initial Bregman distance to the saddle point. Moreover, our line search scheme provably requires only a constant number of calls to a subproblem solver per iteration on average, which makes our first- and second-order methods particularly amenable to implementation.
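To make the idea of optimism concrete, below is a minimal Python sketch of the unconstrained, Euclidean, first-order special case (the classical optimistic gradient method that the abstract says the generalized framework recovers). The toy objective, step size, and variable names are illustrative assumptions, not taken from the paper.

```python
import numpy as np

# Hypothetical toy problem (an illustrative assumption, not from the paper):
#   f(x, y) = (mu/2) * x**2 + x * y - (mu/2) * y**2,
# which is strongly-convex-strongly-concave with saddle point (0, 0).
mu = 0.1

def F(z):
    """Saddle-point operator F(z) = (grad_x f(z), -grad_y f(z))."""
    x, y = z
    return np.array([mu * x + y, -x + mu * y])

# Optimistic gradient update: z_{k+1} = z_k - eta * (2 F(z_k) - F(z_{k-1})).
# The term F(z_k) + (F(z_k) - F(z_{k-1})) serves as an "optimistic" prediction
# of F(z_{k+1}), approximating the implicit proximal point step
# z_{k+1} = z_k - eta * F(z_{k+1}) with only one new gradient evaluation.
eta = 0.3                      # fixed step size chosen for this toy problem
z = np.array([1.0, -1.0])      # initial point z_0
F_prev = F(z)                  # stands in for F(z_{-1}) at the first iteration

for _ in range(200):
    F_curr = F(z)
    z = z - eta * (2.0 * F_curr - F_prev)
    F_prev = F_curr

print(z)  # approaches the saddle point (0, 0)
```

In this sketch the prediction of the next gradient is what makes the explicit update mimic the implicit proximal point step; the paper's generalized method replaces the Euclidean update with a Bregman proximal step and, for its higher-order variants, replaces this simple prediction with higher-order approximations, with step sizes chosen by the backtracking line search described in the abstract.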