VarGrad: A Low-Variance Gradient Estimator for Variational Inference

Paper Title

VarGrad: A Low-Variance Gradient Estimator for Variational Inference

Authors

Lorenz Richter, Ayman Boustati, Nikolas Nüsken, Francisco J. R. Ruiz, Ömer Deniz Akyildiz

Abstract

We analyse the properties of an unbiased gradient estimator of the ELBO for variational inference, based on the score function method with leave-one-out control variates. We show that this gradient estimator can be obtained using a new loss, defined as the variance of the log-ratio between the exact posterior and the variational approximation, which we call the $\textit{log-variance loss}$. Under certain conditions, the gradient of the log-variance loss equals the gradient of the (negative) ELBO. We show theoretically that this gradient estimator, which we call $\textit{VarGrad}$ due to its connection to the log-variance loss, exhibits lower variance than the score function method in certain settings, and that the leave-one-out control variate coefficients are close to the optimal ones. We empirically demonstrate that VarGrad offers a favourable variance versus computation trade-off compared to other state-of-the-art estimators on a discrete VAE.
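The link the abstract describes between the log-variance loss and the leave-one-out score function estimator can be checked numerically. The sketch below is illustrative only and is not the authors' code: it assumes a factorised Bernoulli variational family parameterised by logits, a toy factorised Bernoulli target standing in for the unnormalised joint (`log_joint` and `target_logits` are hypothetical names), and a log-variance loss taken as half the sample variance of the log-ratio, with the samples held fixed when differentiating.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def log_q(z, logits):
    # Log-probability of binary vectors z (shape (S, D)) under
    # independent Bernoulli(sigmoid(logits)) factors.
    p = sigmoid(logits)
    return np.sum(z * np.log(p) + (1.0 - z) * np.log1p(-p), axis=-1)


def log_joint(z, target_logits):
    # Hypothetical toy target (an assumption for this sketch):
    # another factorised Bernoulli playing the role of log p(x, z).
    return log_q(z, target_logits)


def log_ratio(z, logits, target_logits):
    # f_s = log q(z_s) - log p(x, z_s) for each sample.
    return log_q(z, logits) - log_joint(z, target_logits)


def log_variance_loss(logits, z, target_logits):
    # Half the unbiased sample variance of the log-ratio; z is held fixed.
    return 0.5 * np.var(log_ratio(z, logits, target_logits), ddof=1)


def vargrad(logits, z, target_logits):
    # Closed-form gradient of log_variance_loss with respect to logits.
    f = log_ratio(z, logits, target_logits)
    score = z - sigmoid(logits)  # d log q(z_s) / d logits, shape (S, D)
    return (f - f.mean()) @ score / (len(f) - 1)


def score_function_loo(logits, z, target_logits):
    # Score function estimator of the negative-ELBO gradient where each
    # sample's baseline is the mean of f over the other S - 1 samples.
    f = log_ratio(z, logits, target_logits)
    n = len(f)
    baseline = (f.sum() - f) / (n - 1)
    score = z - sigmoid(logits)
    return (f - baseline) @ score / n


rng = np.random.default_rng(0)
logits = np.array([0.3, -1.2, 0.8])
target_logits = np.array([1.0, 0.5, -0.7])
# Draw S = 8 samples from q and treat them as constants afterwards.
z = (rng.random((8, 3)) < sigmoid(logits)).astype(float)

g_vargrad = vargrad(logits, z, target_logits)
g_loo = score_function_loo(logits, z, target_logits)
```

For any fixed set of samples, `g_vargrad` and `g_loo` agree up to floating-point error, because subtracting the leave-one-out mean from each `f_s` is the same as subtracting the full sample mean and rescaling by `S / (S - 1)`.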
