Paper Title
Information-theoretic Feature Selection via Tensor Decomposition and Submodularity
Paper Authors
Paper Abstract
Feature selection by maximizing the high-order mutual information between the selected feature vector and a target variable is the gold standard for selecting the best subset of relevant features, in the sense of maximizing the performance of prediction models. However, such an approach typically requires knowledge of the multivariate probability distribution of all features and the target, and entails a challenging combinatorial optimization problem. Recent work has shown that any joint Probability Mass Function (PMF) can be represented as a naive Bayes model via Canonical Polyadic (tensor rank) Decomposition. In this paper, we introduce a low-rank tensor model of the joint PMF of all variables, together with indirect targeting, as a way of mitigating complexity and maximizing classification performance for a given number of features. Low-rank modeling of the joint PMF circumvents the curse of dimensionality by learning principal components of the joint distribution. By indirectly aiming to predict the latent variable of the naive Bayes model instead of the original target variable, the feature selection problem can be formulated as maximization of a monotone submodular function subject to a cardinality constraint, which can be tackled with a greedy algorithm that comes with performance guarantees. Numerical experiments with several standard datasets suggest that the proposed approach compares favorably to the state of the art on this important problem.
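For context, the CPD-based naive Bayes representation invoked in the abstract can be written out explicitly. The rank F, weights \lambda_h, and factor matrices A_n below follow standard CPD notation and are not necessarily the paper's own symbols:

P(X_1 = i_1, \ldots, X_N = i_N) = \sum_{h=1}^{F} \lambda_h \prod_{n=1}^{N} A_n(i_n, h),

where H \in \{1, \ldots, F\} is the latent variable of the naive Bayes model, \lambda_h = P(H = h) is its prior, and each column A_n(\cdot, h) is the conditional PMF P(X_n = \cdot \mid H = h). Keeping the rank F small is what allows the joint PMF to be learned from data without running into the curse of dimensionality.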
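The greedy algorithm with performance guarantees referenced above is presumably the classic greedy rule for maximizing a monotone submodular function under a cardinality constraint, which is guaranteed to achieve at least a (1 - 1/e) fraction of the optimal value. Below is a minimal, generic Python sketch; greedy_max and the toy weighted-coverage objective f are illustrative placeholders, not the paper's information-theoretic objective:

# Generic greedy maximization of a monotone submodular set function f
# subject to |S| <= k. At each step, add the element with the largest
# marginal gain; stop early if no element yields positive gain.
def greedy_max(f, ground_set, k):
    selected = set()
    for _ in range(k):
        best_gain, best_elem = 0.0, None
        for e in ground_set - selected:
            gain = f(selected | {e}) - f(selected)  # marginal gain of e
            if gain > best_gain:
                best_gain, best_elem = gain, e
        if best_elem is None:  # no positive marginal gain remains
            break
        selected.add(best_elem)
    return selected

# Toy monotone submodular objective (coverage), for illustration only.
coverage = {1: {"a", "b"}, 2: {"b", "c"}, 3: {"c"}, 4: {"d", "e"}}

def f(S):
    return len(set().union(*(coverage[e] for e in S))) if S else 0

print(greedy_max(f, set(coverage), k=2))  # {1, 4}, covering a, b, d, e

In the paper's setting, f would instead measure how informative a candidate feature subset is about the latent variable of the naive Bayes model; the coverage function here merely stands in as a simple monotone submodular example.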