Paper title
Clustering of non-Gaussian data by variational Bayes for normal inverse Gaussian mixture models
Authors
Abstract
Finite mixture models, typically Gaussian mixtures, are well known and widely used for model-based clustering. In practice, however, many data sets are non-Gaussian: heavy-tailed, asymmetric, or both. Normal inverse Gaussian (NIG) distributions are normal variance-mean mixtures whose mixing densities are inverse Gaussian distributions, and they can model both heavy tails and asymmetry. For NIG mixture models, both expectation-maximization and variational Bayesian (VB) algorithms have been proposed. However, the existing VB algorithm for NIG mixtures has the disadvantage that the shape of the mixing density is limited. In this paper, we propose another VB algorithm for NIG mixtures that overcomes this shortcoming. We also propose an extension to Dirichlet process mixture models to overcome the difficulty of determining the number of clusters in finite mixture models. We evaluated the performance on artificial data and found that the proposed method outperformed Gaussian mixtures and existing implementations for NIG mixtures, especially for highly non-normal data.
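The abstract describes the NIG distribution as a normal variance-mean mixture with an inverse Gaussian mixing density. As a rough illustration only, the sketch below draws NIG samples through that two-stage construction. It assumes the common (alpha, beta, mu, delta) parameterization, which may differ from the one used in the paper, and the function name `sample_nig` is hypothetical. It is not the VB algorithm proposed in the paper.

```python
import numpy as np


def sample_nig(alpha, beta, mu, delta, size, rng):
    """Draw NIG samples via the normal variance-mean mixture representation.

    Assumption: the (alpha, beta, mu, delta) parameterization, with
    alpha > |beta| and delta > 0. The paper may use a different one.
    """
    if not (alpha > abs(beta) and delta > 0):
        raise ValueError("require alpha > |beta| and delta > 0")
    gamma = np.sqrt(alpha**2 - beta**2)
    # Mixing variable: inverse Gaussian with mean delta/gamma and shape delta^2.
    z = rng.wald(mean=delta / gamma, scale=delta**2, size=size)
    # Conditionally on z, x is normal with mean mu + beta*z and variance z.
    return mu + beta * z + np.sqrt(z) * rng.standard_normal(size)


rng = np.random.default_rng(0)
x = sample_nig(alpha=2.0, beta=1.0, mu=0.0, delta=1.0, size=200_000, rng=rng)
gamma = np.sqrt(2.0**2 - 1.0**2)
# Theoretical moments under this parameterization:
# mean = mu + delta*beta/gamma, variance = delta*alpha^2/gamma^3.
print("sample mean:", x.mean(), "theory:", 1.0 * 1.0 / gamma)
print("sample var: ", x.var(), "theory:", 1.0 * 2.0**2 / gamma**3)
```

In this parameterization, a nonzero `beta` skews the distribution, while the random variance `z` produces tails heavier than a Gaussian's. These are the two properties the abstract attributes to NIG distributions.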