论文标题
通过多任务高斯流程进行集群特定的预测
Cluster-Specific Predictions with Multi-Task Gaussian Processes
论文作者
论文摘要
引入了涉及高斯流程(GP)的模型,以同时处理多个功能数据的多任务学习,聚类和预测。该过程充当了功能数据的基于模型的聚类方法,也是对新任务进行后续预测的学习步骤。该模型是将多任务GPS与常见平均过程的混合物实例化。得出了一种用于处理超参数的优化以及超伴随者对潜在变量和过程的估计的优化。我们建立了明确的公式,用于将平均过程和潜在聚类变量整合到预测分布中,这考虑了这两个方面的不确定性。该分布定义为群集特异性GP预测的混合物,在处理组结构数据时,可以增强性能。该模型处理观察的不规则网格,并提供了有关协方差结构的不同假设,用于在任务中共享其他信息。聚类和预测任务的性能将通过各种模拟方案和真实数据集进行评估。总体算法称为magmaclust,作为R包公开使用。
A model involving Gaussian processes (GPs) is introduced to simultaneously handle multi-task learning, clustering, and prediction for multiple functional data. This procedure acts as a model-based clustering method for functional data as well as a learning step for subsequent predictions for new tasks. The model is instantiated as a mixture of multi-task GPs with common mean processes. A variational EM algorithm is derived for dealing with the optimisation of the hyper-parameters along with the hyper-posteriors' estimation of latent variables and processes. We establish explicit formulas for integrating the mean processes and the latent clustering variables within a predictive distribution, accounting for uncertainty on both aspects. This distribution is defined as a mixture of cluster-specific GP predictions, which enhances the performances when dealing with group-structured data. The model handles irregular grid of observations and offers different hypotheses on the covariance structure for sharing additional information across tasks. The performances on both clustering and prediction tasks are assessed through various simulated scenarios and real datasets. The overall algorithm, called MagmaClust, is publicly available as an R package.