论文标题
Langevin算法的固定分布的浓度
Concentration of the Langevin Algorithm's Stationary Distribution
论文作者
论文摘要
Loge-Concave采样的一种规范算法是Langevin算法,又称Langevin扩散运行,以某些离散化步骤$η> 0 $。这种离散化导致Langevin算法具有固定分布$π_η$,这与Langevin扩散的固定分布$π$不同,并且了解$π$的知名属性是否扩展到$π_η$,这是一个重要的挑战。特别是,尽管诸如等级和快速衰减的尾巴之类的浓度属性以$π$而闻名,但$π_η$的类似属性是具有算法含义的开放问题。该注释通过建立$π_η$的浓度结果,以$π$的形式进行经典结果,从而为朝这个方向提供了第一步。具体而言,我们表明,对于任何非平凡的步骤$η> 0 $,$π_η$分别是次指数(分别为sub-gaussian),当电势分别为凸(分别为凸)时(强烈凸出)。此外,我们显示的集中界基本紧密。我们还表明,这些浓度界限沿Langevin算法的轨迹扩展到所有迭代,以及使用梯度下估计的不精确实现。 我们分析的关键是使用旋转不变的力矩生成函数(又称贝塞尔函数)来研究langevin算法的固定动力学。该技术可能具有独立的关注,因为它可以直接分析离散时间固定分布$π_η$,而无需浏览连续的时间固定分布$π$作为中介。
A canonical algorithm for log-concave sampling is the Langevin Algorithm, aka the Langevin Diffusion run with some discretization stepsize $η> 0$. This discretization leads the Langevin Algorithm to have a stationary distribution $π_η$ which differs from the stationary distribution $π$ of the Langevin Diffusion, and it is an important challenge to understand whether the well-known properties of $π$ extend to $π_η$. In particular, while concentration properties such as isoperimetry and rapidly decaying tails are classically known for $π$, the analogous properties for $π_η$ are open questions with algorithmic implications. This note provides a first step in this direction by establishing concentration results for $π_η$ that mirror classical results for $π$. Specifically, we show that for any nontrivial stepsize $η> 0$, $π_η$ is sub-exponential (respectively, sub-Gaussian) when the potential is convex (respectively, strongly convex). Moreover, the concentration bounds we show are essentially tight. We also show that these concentration bounds extend to all iterates along the trajectory of the Langevin Algorithm, and to inexact implementations which use sub-Gaussian estimates of the gradient. Key to our analysis is the use of a rotation-invariant moment generating function (aka Bessel function) to study the stationary dynamics of the Langevin Algorithm. This technique may be of independent interest because it enables directly analyzing the discrete-time stationary distribution $π_η$ without going through the continuous-time stationary distribution $π$ as an intermediary.