Paper Title
Foundation Posteriors for Approximate Probabilistic Inference
Paper Authors
Paper Abstract
Probabilistic programs provide an expressive representation language for generative models. Given a probabilistic program, we are interested in the task of posterior inference: estimating a latent variable given a set of observed variables. Existing techniques for inference in probabilistic programs often require choosing many hyper-parameters, are computationally expensive, and/or only work for restricted classes of programs. Here we formulate inference as masked language modeling: given a program, we generate a supervised dataset of variables and assignments, and randomly mask a subset of the assignments. We then train a neural network to unmask the random values, defining an approximate posterior distribution. By optimizing a single neural network across a range of programs we amortize the cost of training, yielding a "foundation" posterior able to do zero-shot inference for new programs. The foundation posterior can also be fine-tuned for a particular program and dataset by optimizing a variational inference objective. We show the efficacy of the approach, zero-shot and fine-tuned, on a benchmark of STAN programs.
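To make the recipe in the abstract concrete, below is a minimal sketch of the masked-assignment training idea: sample variable assignments from a (toy) generative program, randomly mask a subset, and train a network to predict the masked values, which defines an approximate posterior. All names here (`toy_program`, `MaskedPosteriorNet`) and modeling choices (a diagonal Gaussian over masked slots) are illustrative assumptions, not the authors' implementation or the paper's architecture.

```python
# Hypothetical sketch of "inference as masked language modeling" over program
# assignments. Not the paper's code; a toy stand-in for the general idea.
import torch
import torch.nn as nn

def toy_program(n_obs=3):
    """Stand-in generative program: returns a flat vector of variable assignments."""
    mu = torch.randn(())                     # latent variable
    xs = mu + 0.5 * torch.randn(n_obs)       # observed variables given the latent
    return torch.cat([mu.view(1), xs])       # [latent, obs_1, ..., obs_n]

class MaskedPosteriorNet(nn.Module):
    """Predicts a Gaussian over each masked assignment from the unmasked ones."""
    def __init__(self, n_slots, hidden=64):
        super().__init__()
        self.body = nn.Sequential(
            nn.Linear(2 * n_slots, hidden), nn.ReLU(),
            nn.Linear(hidden, 2 * n_slots),  # per-slot mean and log-std
        )

    def forward(self, values, mask):
        # Zero out masked values; the mask itself is also given as input.
        h = torch.cat([values * (1 - mask), mask], dim=-1)
        mean, log_std = self.body(h).chunk(2, dim=-1)
        return mean, log_std

n_slots = 4
net = MaskedPosteriorNet(n_slots)
opt = torch.optim.Adam(net.parameters(), lr=1e-3)

for step in range(1000):
    # Generate a supervised dataset of assignments and mask a random subset.
    values = torch.stack([toy_program() for _ in range(64)])
    mask = (torch.rand_like(values) < 0.3).float()
    mean, log_std = net(values, mask)
    # Gaussian negative log-likelihood on the masked slots only ("unmasking" loss).
    nll = 0.5 * ((values - mean) / log_std.exp()) ** 2 + log_std
    loss = (nll * mask).sum() / mask.sum().clamp(min=1.0)
    opt.zero_grad()
    loss.backward()
    opt.step()
```

In the paper's setting, the analogous network is trained across many programs rather than a single toy one, which is what amortizes training and yields zero-shot inference on new programs; fine-tuning on a specific program and dataset would swap the supervised unmasking loss for a variational inference objective.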