Paper Title

Probabilistic Circuits for Variational Inference in Discrete Graphical Models

Authors

Andy Shih, Stefano Ermon

Abstract

Inference in discrete graphical models with variational methods is difficult because of the inability to re-parameterize gradients of the Evidence Lower Bound (ELBO). Many sampling-based methods have been proposed for estimating these gradients, but they suffer from high bias or variance. In this paper, we propose a new approach that leverages the tractability of probabilistic circuit models, such as Sum-Product Networks (SPNs), to compute ELBO gradients exactly (without sampling) for a certain class of densities. In particular, we show that selective-SPNs are suitable as an expressive variational distribution, and prove that when the log-density of the target model is a polynomial, the corresponding ELBO can be computed analytically. To scale to graphical models with thousands of variables, we develop an efficient and effective construction of selective-SPNs with size $O(kn)$, where $n$ is the number of variables and $k$ is an adjustable hyperparameter. We demonstrate our approach on three types of graphical models: Ising models, Latent Dirichlet Allocation, and factor graphs from the UAI Inference Competition. Selective-SPNs give a better lower bound than mean-field and structured mean-field, and are competitive with approximations that do not provide a lower bound, such as Loopy Belief Propagation and Tree-Reweighted Belief Propagation. Our results show that probabilistic circuits are promising tools for variational inference in discrete graphical models as they combine tractability and expressivity.