论文标题

高斯混合模型的张量矩:理论和应用

Tensor Moments of Gaussian Mixture Models: Theory and Applications

论文作者

Pereira, João M., Kileel, Joe, Kolda, Tamara G.

论文摘要

高斯混合模型(GMM)是统计和数据科学领域的基本工具。我们研究多元高斯和GMM的时刻。 $ n $二维的随机变量的$ d $ - themoment是对称的$ d $ - 张张量的尺寸$ n^d $的张量,因此以$ n^d $的矩量为$ n^d $,以$ d> 2 $和较大的$ n $的矩量很高。在这项工作中,我们使用GMM的矩张量为\ emph {隐式计算}开发理论和数值方法,将计算和存储成本降低至$ \ Mathcal {o}(n^2)$和$ \ Mathcal {o}(o}(o}(n^3)$,对于一般的covarianciance n $ and $ $ \ $ \ c。 $ \ Mathcal {o}(n)$用于对角线的$。我们根据对称张量产物来得出矩的简明分析表达式,依赖于对称张量和均相多项式之间的对应关系,以及涉及贝尔多项式的组合身份。该理论的主要应用是从一组观测值中估算GMM参数(均值和协方差),当将其作为矩匹配优化问题时。如果有一个已知且常见的协方差矩阵,我们还表明可以将数据观测值进行辩解,在这种情况下,估计未知意味着减少到对称的CP张量分解的问题。数值结果验证并说明了我们方法的数值效率。与期望最大化的方法相比,这项工作有可能为矩方法方法的竞争力打开,用于GMM的参数估计。

Gaussian mixture models (GMMs) are fundamental tools in statistical and data sciences. We study the moments of multivariate Gaussians and GMMs. The $d$-th moment of an $n$-dimensional random variable is a symmetric $d$-way tensor of size $n^d$, so working with moments naively is assumed to be prohibitively expensive for $d>2$ and larger values of $n$. In this work, we develop theory and numerical methods for \emph{implicit computations} with moment tensors of GMMs, reducing the computational and storage costs to $\mathcal{O}(n^2)$ and $\mathcal{O}(n^3)$, respectively, for general covariance matrices, and to $\mathcal{O}(n)$ and $\mathcal{O}(n)$, respectively, for diagonal ones. We derive concise analytic expressions for the moments in terms of symmetrized tensor products, relying on the correspondence between symmetric tensors and homogeneous polynomials, and combinatorial identities involving Bell polynomials. The primary application of this theory is to estimating GMM parameters (means and covariances) from a set of observations, when formulated as a moment-matching optimization problem. If there is a known and common covariance matrix, we also show it is possible to debias the data observations, in which case the problem of estimating the unknown means reduces to symmetric CP tensor decomposition. Numerical results validate and illustrate the numerical efficiency of our approaches. This work potentially opens the door to the competitiveness of the method of moments as compared to expectation maximization methods for parameter estimation of GMMs.

扫码加入交流群

加入微信交流群

微信交流群二维码

扫码加入学术交流群,获取更多资源