Paper Title

Subspace Learning for Feature Selection via Rank Revealing QR Factorization: Unsupervised and Hybrid Approaches with Non-negative Matrix Factorization and Evolutionary Algorithm

Authors

Moslemi, Amir; Ahmadian, Arash

Abstract

Selecting the most informative and discriminative features from high-dimensional data is an important topic in machine learning and data engineering. Matrix factorization-based techniques, such as non-negative matrix factorization (NMF), have become an active research direction in feature selection. The main goal of feature selection using matrix factorization is to extract a subspace that approximates the original space in a lower dimension. In this study, rank revealing QR (RRQR) factorization, which is computationally cheaper than singular value decomposition (SVD), is leveraged to obtain the most informative features as a novel unsupervised feature selection technique. This technique uses the permutation matrix of QR for feature selection, a property unique to this factorization method. Moreover, QR factorization is embedded into the NMF objective function as a new unsupervised feature selection method. Lastly, a hybrid feature selection algorithm is proposed by coupling RRQR, as a filter-based technique, with a genetic algorithm, as a wrapper-based technique. In this method, redundant features are removed using RRQR factorization, and the most discriminative subset of features is selected using the genetic algorithm. The proposed algorithm is shown to be dependable and robust when compared against state-of-the-art feature selection algorithms in supervised, unsupervised, and semi-supervised settings. All methods are tested on seven available microarray datasets using KNN, SVM, and C4.5 classifiers. In terms of evaluation metrics, the experimental results show that the proposed method is comparable with state-of-the-art feature selection methods.
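
To make concrete the idea of selecting features through the permutation of a pivoted QR factorization, here is a minimal sketch. It is not the authors' implementation: it assumes that SciPy's column-pivoted QR (`scipy.linalg.qr` with `pivoting=True`) is an acceptable stand-in for a rank revealing QR, and the function name `select_features_pivoted_qr`, the parameter `k`, and the synthetic data are hypothetical choices made for illustration only.

```python
import numpy as np
from scipy.linalg import qr


def select_features_pivoted_qr(X, k):
    """Return the indices of k columns (features) of X picked by column-pivoted QR.

    Assumption: column pivoting is used here as a simple rank-revealing
    heuristic; the paper's exact RRQR variant may differ.
    """
    X = np.asarray(X, dtype=float)
    n_features = X.shape[1]
    if not 1 <= k <= n_features:
        raise ValueError("k must be between 1 and the number of features")
    # With pivoting=True, SciPy returns Q, R and an index array `perm`
    # such that X[:, perm] = Q @ R. Columns that come first in `perm`
    # were pivoted first, i.e. they add the most new information.
    _, _, perm = qr(X, mode="economic", pivoting=True)
    return perm[:k]


# Synthetic illustration (hypothetical data): 3 independent features
# followed by 3 features that are linear combinations of them.
rng = np.random.default_rng(0)
base = rng.normal(size=(50, 3))
redundant = base @ rng.normal(size=(3, 3))
X_demo = np.hstack([base, redundant])

selected = select_features_pivoted_qr(X_demo, k=3)
print("selected feature indices:", sorted(selected.tolist()))
```

In this sketch the three selected columns span the same subspace as all six columns, which reflects the filter step described in the abstract: redundant features are dropped while the retained ones still approximate the original space. A wrapper stage such as a genetic algorithm would then search within the retained features; that stage is not shown here.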
