Paper Title

Generic Unsupervised Optimization for a Latent Variable Model With Exponential Family Observables

Paper Authors

Hamid Mousavi, Jakob Drefs, Florian Hirschberger, Jörg Lücke

Paper Abstract

Latent variable models (LVMs) represent observed variables by parameterized functions of latent variables. Prominent examples of LVMs for unsupervised learning are probabilistic PCA and probabilistic sparse coding (SC), which both assume a weighted linear summation of the latents to determine the mean of a Gaussian distribution for the observables. In many cases, however, observables do not follow a Gaussian distribution. For unsupervised learning, LVMs which assume specific non-Gaussian observables have therefore been considered. Even for specific choices of distributions, parameter optimization is challenging, and only a few previous contributions have considered LVMs with more generally defined observable distributions. Here, we consider LVMs that are defined for a range of different distributions, i.e., observables can follow any (regular) distribution of the exponential family. The novel class of LVMs presented is defined for binary latents, and it uses maximization in place of summation to link the latents to observables. To derive an optimization procedure, we follow an EM approach for maximum likelihood parameter estimation. We show that a set of very concise parameter update equations can be derived which feature the same functional form for all exponential family distributions. The derived generic optimization can consequently be applied to different types of metric data as well as to different types of discrete data. Also, the derived optimization equations can be combined with a recently suggested variational acceleration which is likewise generically applicable to the LVMs considered here. The combination therefore maintains the generic and direct applicability of the derived optimization procedure but, crucially, enables efficient scalability. We numerically verify our analytical results and discuss potential applications such as learning of variance structure, noise type estimation, and denoising.
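For orientation, the generative model the abstract describes can be sketched as follows. This is a minimal reconstruction from the abstract's wording, not the paper's exact notation; the symbols $H$, $\pi_h$, $W_{dh}$, and the exponential-family functions $h$, $T$, $A$, $\eta$ are illustrative choices. Binary latents are drawn from independent Bernoulli priors,

\[ p(\vec{s} \mid \vec{\pi}) = \prod_{h=1}^{H} \pi_h^{s_h} (1 - \pi_h)^{1 - s_h}, \qquad s_h \in \{0, 1\}, \]

and each observable $y_d$ follows an exponential-family distribution whose parameter is set by a maximization (in place of the usual summation) over the active latents,

\[ p(y_d \mid \vec{s}, \Theta) = h(y_d) \exp\!\big( \eta(\bar{W}_d(\vec{s}))\, T(y_d) - A(\eta(\bar{W}_d(\vec{s}))) \big), \qquad \bar{W}_d(\vec{s}) = \max_h \{ s_h W_{dh} \}. \]

Because the E- and M-steps touch the observable distribution only through $T$, $A$, and $\eta$, the resulting update equations can share one functional form across all (regular) exponential-family members, which is the genericity the abstract refers to.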

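To make the EM scheme concrete, below is a hedged, self-contained toy sketch in Python/NumPy for the Gaussian member of the family (where the relevant sufficient statistic is T(y) = y). It runs an exact E-step by enumerating all 2^H binary latent states, followed by posterior-weighted M-step updates; the variational acceleration mentioned in the abstract would replace the exhaustive enumeration at scale. All variable names, the all-zero-state convention, and the initialization are illustrative choices, not taken from the paper.

import itertools
import numpy as np

# Toy EM for a binary-latent LVM with a max link and Gaussian observables
# (one exponential-family member). Exact E-step over all 2^H states;
# everything here is an illustrative sketch, not the paper's implementation.

rng = np.random.default_rng(0)
H, D, N = 4, 8, 500                                    # latents, observables, data points

# Sample toy data from ground-truth parameters.
W_true = rng.uniform(1.0, 5.0, size=(H, D))
S = (rng.random((N, H)) < 0.3).astype(float)           # binary latents s_h
mean = np.where(S.any(1, keepdims=True),
                (S[:, :, None] * W_true[None]).max(1), 0.0)  # max link (all-zero state -> 0)
Y = mean + rng.normal(scale=0.5, size=(N, D))

# All 2^H binary configurations for the exact E-step.
states = np.array(list(itertools.product([0, 1], repeat=H)), float)  # (K, H)

def state_means(W):
    """Per-state observable means under the max link."""
    vals = states[:, :, None] * W[None]                # (K, H, D)
    mu = np.where(states.any(1, keepdims=True), vals.max(1), 0.0)
    return vals, mu

W = rng.uniform(1.0, 5.0, size=(H, D))                 # illustrative initialization
pi, sigma2 = np.full(H, 0.5), 1.0
for _ in range(50):
    # E-step: exact posterior over all latent states for every data point.
    vals, mu = state_means(W)
    log_post = (states @ np.log(pi) + (1 - states) @ np.log(1 - pi))[None] \
        - 0.5 * ((Y[:, None, :] - mu[None]) ** 2).sum(-1) / sigma2
    log_post -= log_post.max(1, keepdims=True)
    post = np.exp(log_post)
    post /= post.sum(1, keepdims=True)                 # (N, K)

    # M-step: posterior-weighted averages of sufficient statistics.
    # A[k, h, d] marks which active latent h attains the max in dim d of state k.
    A = ((vals == vals.max(1, keepdims=True)) & (states[:, :, None] > 0)).astype(float)
    W = np.einsum('nk,khd,nd->hd', post, A, Y) / (np.einsum('nk,khd->hd', post, A) + 1e-9)
    pi = (post @ states).mean(0).clip(1e-3, 1 - 1e-3)
    sigma2 = np.einsum('nk,nkd->', post, (Y[:, None, :] - mu[None]) ** 2) / (N * D)

print("recovered pi:", np.round(pi, 2))

Note how the update for W is simply a posterior-weighted average of the sufficient statistic T(y) = y, restricted to the latent that attains the maximum in each dimension; this is the kind of concise, distribution-agnostic update form the abstract describes.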