Paper Title
Feature Selection via the Intervened Interpolative Decomposition and its Application in Diversifying Quantitative Strategies
Paper Authors
Paper Abstract
In this paper, we propose a probabilistic model for computing an interpolative decomposition (ID) in which each column of the observed matrix has its own priority or importance. The decomposition therefore finds a subset of features that is representative of the entire feature set, while also favoring features with higher priority than the rest. The ID is commonly used for low-rank approximation, feature selection, and extracting hidden patterns in data, where the matrix factors are latent variables associated with each data dimension. Inference is Bayesian and is carried out by Gibbs sampling. We evaluate the proposed model on real-world datasets, including ten Chinese A-share stocks, and demonstrate that the proposed Bayesian ID algorithm with intervention (IID) produces reconstruction errors comparable to those of existing Bayesian ID algorithms while selecting features with higher scores or priority.
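For readers unfamiliar with the interpolative decomposition itself, the sketch below shows a plain deterministic ID in NumPy: it picks k columns of a matrix A and expresses every column of A as a linear combination of them, so that A ≈ C W. This is not the paper's method. It uses greedy pivoted column selection and least squares rather than Bayesian inference with Gibbs sampling, and it has no notion of column priority. The function names `select_columns` and `interpolative_decomposition` and the synthetic test matrix are assumptions made for illustration only.

```python
import numpy as np


def select_columns(A, k):
    """Greedily pick k column indices (pivoted Gram-Schmidt style).

    Illustrative helper, not from the paper: at each step take the column
    with the largest residual norm, then remove its direction from all columns.
    """
    R = np.array(A, dtype=float)  # working copy of the residual
    chosen = []
    for _ in range(k):
        norms = np.linalg.norm(R, axis=0)
        j = int(np.argmax(norms))
        chosen.append(j)
        q = R[:, j] / norms[j]        # unit vector along the chosen residual
        R -= np.outer(q, q @ R)       # deflate that direction everywhere
    return sorted(chosen)


def interpolative_decomposition(A, k):
    """Return (cols, C, W) with C = A[:, cols] and A approximately C @ W."""
    cols = select_columns(A, k)
    C = A[:, cols]
    # Least-squares weights expressing every column of A in terms of C.
    W, *_ = np.linalg.lstsq(C, A, rcond=None)
    return cols, C, W


# Synthetic rank-3 matrix: 20 observations of 8 features (assumed data).
rng = np.random.default_rng(0)
A = rng.standard_normal((20, 3)) @ rng.standard_normal((3, 8))

cols, C, W = interpolative_decomposition(A, k=3)
rel_err = np.linalg.norm(A - C @ W) / np.linalg.norm(A)
```

Here `cols` plays the role of the selected features, and `rel_err` is the relative reconstruction error, the kind of quantity the abstract compares across algorithms. Because the synthetic matrix has exact rank 3, three well-chosen columns reproduce it up to floating-point error. On real data the error depends on k and on which columns are selected.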