Paper Title
Accelerated Alternating Minimization and Adaptability to Strong Convexity
Paper Authors
Paper Abstract
In the first part of the paper, we consider an accelerated first-order optimization method for convex functions with $L$-Lipschitz-continuous gradient that is able to automatically adapt to problems which satisfy the Polyak-Łojasiewicz condition or which are strongly convex with parameter $\mu$. In these cases, if $\mu$ is unknown, the method exhibits linear convergence with factor $\frac{\mu}{L}$. If the problem is strongly convex and $\mu$ is known, then the method exhibits linear convergence with factor $\sqrt{\frac{\mu}{L}}$. If neither is the case, the method converges at the rate $O(1/k^2)$. The second part contains a generalization of the method to problems that allow alternating minimization, together with proofs of the same asymptotic convergence rates. We also consider the approach called Adaptive Catalyst, which allows increasing the convergence rate up to $O(1/k^2)$, and we provide an experimental comparison of the approach with AGM, Alternating AGM, APDAGD and Sinkhorn's algorithm on the dual problem to the discrete entropically regularized optimal transport problem. The result of this work is an attempt to explain why Alternating AGM converges faster than AGM or Adaptive Catalyst despite the same asymptotic theoretical rate $O(1/k^2)$. The hypothesis relies on the fact that Alternating AGM adapts to strong convexity. The hypothesis was tested on quadratic functions, on which Alternating AGM also showed faster convergence.
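For context, the two regularity conditions referenced in the abstract have standard definitions; the statements below are the usual textbook forms and are not quoted from the paper itself. A differentiable function $f$ with optimal value $f^* = \min_x f(x)$ satisfies the Polyak-Łojasiewicz (PL) condition with parameter $\mu > 0$ if
$$\frac{1}{2}\,\|\nabla f(x)\|^2 \;\ge\; \mu\left(f(x) - f^*\right) \quad \text{for all } x,$$
and $f$ is $\mu$-strongly convex if
$$f(y) \;\ge\; f(x) + \langle \nabla f(x),\, y - x \rangle + \frac{\mu}{2}\,\|y - x\|^2 \quad \text{for all } x, y.$$
Strong convexity with parameter $\mu$ implies the PL condition with the same $\mu$, which is why a method adaptive to the PL condition also covers the strongly convex case. Likewise, the alternating minimization setting mentioned in the second part refers to problems whose variables split into blocks $x = (x_1, \dots, x_n)$, each of which can be minimized exactly while the others are held fixed; one pass of the basic (non-accelerated) scheme, again in its standard form rather than as stated in the paper, reads
$$x_i^{k+1} = \operatorname*{arg\,min}_{x_i}\, f\big(x_1^{k+1}, \dots, x_{i-1}^{k+1},\, x_i,\, x_{i+1}^{k}, \dots, x_n^{k}\big), \quad i = 1, \dots, n.$$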