Paper Title
On Divergence Measures for Bayesian Pseudocoresets
Paper Authors
Abstract
A Bayesian pseudocoreset is a small synthetic dataset for which the posterior over parameters approximates that of the original dataset. While promising, the scalability of Bayesian pseudocoresets has not yet been validated on realistic problems such as image classification with deep neural networks. On the other hand, dataset distillation methods similarly construct a small dataset such that optimization on the synthetic dataset converges to a solution whose performance is competitive with optimization on the full data. Although dataset distillation has been empirically verified in large-scale settings, the framework is restricted to point estimates, and its adaptation to Bayesian inference has not been explored. This paper casts two representative dataset distillation algorithms as approximations to methods for constructing pseudocoresets by minimizing specific divergence measures: reverse KL divergence and Wasserstein distance. Furthermore, we provide a unifying view of such divergence measures in Bayesian pseudocoreset construction. Finally, we propose a novel Bayesian pseudocoreset algorithm based on minimizing forward KL divergence. Our empirical results demonstrate that the pseudocoresets constructed from these methods reflect the true posterior even in high-dimensional Bayesian inference problems.
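To make the forward-KL idea concrete, below is a minimal toy sketch (not the paper's algorithm) for a conjugate Gaussian-mean model, where both the full-data posterior and the pseudocoreset posterior are available in closed form. A single synthetic point `u` is optimized by gradient descent to minimize the forward KL divergence KL(p_full || p_pseudo); all variable names and hyperparameters are illustrative assumptions.

```python
import numpy as np

# Toy model (assumption, for illustration only):
#   theta ~ N(0, tau2),  x_i | theta ~ N(theta, sigma2)
# Both posteriors are Gaussian, so the forward KL is available in closed form.
tau2, sigma2 = 1.0, 1.0
rng = np.random.default_rng(0)
x = rng.normal(2.0, 1.0, size=100)           # full dataset, n = 100

def posterior(data):
    """Closed-form Gaussian posterior (mean, variance) given the data."""
    n = len(data)
    var = tau2 * sigma2 / (n * tau2 + sigma2)
    mean = tau2 * np.sum(data) / (n * tau2 + sigma2)
    return mean, var

def forward_kl(p, q):
    """KL(N(m1, v1) || N(m2, v2)) for one-dimensional Gaussians."""
    (m1, v1), (m2, v2) = p, q
    return 0.5 * (np.log(v2 / v1) + (v1 + (m1 - m2) ** 2) / v2 - 1.0)

p_full = posterior(x)
u = np.array([0.0])                          # one-point pseudocoreset
for _ in range(500):                         # gradient descent on u
    eps = 1e-5                               # finite-difference gradient
    g = (forward_kl(p_full, posterior(u + eps))
         - forward_kl(p_full, posterior(u - eps))) / (2 * eps)
    u -= 0.5 * g

# The optimized pseudo-point matches the full-data posterior mean; the
# residual KL reflects the variance mismatch of a one-point posterior.
print(u[0], posterior(u)[0], p_full[0])
```

In this conjugate setting the pseudocoreset posterior mean converges to the full-data posterior mean, while the posterior variance stays wider because it is determined by the pseudocoreset size; minimizing forward KL trades these off explicitly.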