Paper title
Coupled regularized sample covariance matrix estimator for multiple classes
Paper authors
Paper abstract
The estimation of covariance matrices of multiple classes with limited training data is a difficult problem. The sample covariance matrix (SCM) is known to perform poorly when the number of variables is large compared to the number of available samples. In order to reduce the mean squared error (MSE) of the SCM, regularized (shrinkage) SCM estimators are often used. In this work, we consider regularized SCM (RSCM) estimators for multiclass problems that couple together two different target matrices for regularization: the pooled (average) SCM of the classes and the scaled identity matrix. Regularization toward the pooled SCM is beneficial when the population covariances are similar, whereas regularization toward the identity matrix guarantees that the estimators are positive definite. We derive the MSE-optimal tuning parameters for the estimators and propose a method for their estimation under the assumption that the class populations follow (unspecified) elliptical distributions with finite fourth-order moments. The MSE performance of the proposed coupled RSCMs is evaluated with simulations and in a regularized discriminant analysis (RDA) classification set-up on real data. The results based on three different real data sets indicate comparable performance to cross-validation but with a significant speed-up in computation time.
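The idea of shrinking each class SCM toward two targets at once can be illustrated with a minimal NumPy sketch. The abstract does not give the estimator's exact parametrization, so the convex-combination form below, the size-weighted pooling, the function names `sample_covariance` and `coupled_rscm`, and the hand-picked values of `alpha` and `beta` are all assumptions for illustration. In particular, the tuning parameters here are fixed by hand and are not the MSE-optimal ones derived in the paper.

```python
import numpy as np


def sample_covariance(X):
    """Unbiased sample covariance matrix (SCM) of the rows of X (n x p)."""
    Xc = X - X.mean(axis=0)
    return Xc.T @ Xc / (X.shape[0] - 1)


def coupled_rscm(class_samples, alpha, beta):
    """Shrink each class SCM toward the pooled SCM and a scaled identity.

    Assumed illustrative form (not necessarily the paper's):
        C_k     = alpha * S_k + (1 - alpha) * S_pool
        Sigma_k = beta * C_k + (1 - beta) * (trace(C_k) / p) * I
    with 0 <= alpha <= 1 and 0 <= beta < 1.
    """
    scms = [sample_covariance(X) for X in class_samples]
    # Assumption: pool the SCMs with weights proportional to class sizes.
    sizes = np.array([X.shape[0] for X in class_samples], dtype=float)
    weights = sizes / sizes.sum()
    pooled = sum(w * S for w, S in zip(weights, scms))
    p = pooled.shape[0]

    estimates = []
    for S in scms:
        C = alpha * S + (1.0 - alpha) * pooled
        # Scaled identity target with the same trace as C.
        target = (np.trace(C) / p) * np.eye(p)
        estimates.append(beta * C + (1.0 - beta) * target)
    return estimates


# Synthetic example: three classes, fewer samples than variables per class.
rng = np.random.default_rng(0)
p = 20
class_samples = [rng.standard_normal((n, p)) for n in (8, 12, 10)]
estimates = coupled_rscm(class_samples, alpha=0.5, beta=0.7)

# Smallest eigenvalue of each regularized estimate.
min_eigs = [float(np.linalg.eigvalsh(E).min()) for E in estimates]
```

With fewer samples than variables, each raw SCM is singular. Any `beta` strictly below 1 adds a positive multiple of the identity, so the regularized estimates are positive definite and can be inverted, which is what RDA requires.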