Paper title
FastSTMF: Efficient tropical matrix factorization algorithm for sparse data
Paper authors
Paper abstract
Matrix factorization, one of the most popular methods in machine learning, has recently benefited from the introduction of non-linearity in prediction tasks through the tropical semiring. The non-linearity enables a better fit to extreme values and distributions, thus discovering high-variance patterns that differ from those found by standard linear algebra. However, the optimization process of existing tropical matrix factorization methods is slow. In our work, we propose a new method, FastSTMF, based on Sparse Tropical Matrix Factorization (STMF), which introduces a novel strategy for updating factor matrices that results in efficient computational performance. We evaluated the efficiency of FastSTMF on synthetic and real gene expression data from the TCGA database, and the results show that FastSTMF outperforms STMF in both accuracy and running time. Compared to NMF, we show that FastSTMF performs better on some datasets and is less prone to overfitting than NMF. This work sets the basis for developing other matrix factorization techniques based on many other semirings using the newly proposed optimization process.
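To make the non-linearity concrete, the following is a minimal sketch of a tropical matrix product compared with the ordinary linear one. It assumes the max-plus convention for the tropical semiring (addition replaced by max, multiplication replaced by +); the function name `maxplus_product` and the example matrices are illustrative only and are not taken from the paper. It does not implement the FastSTMF update strategy, only the product that such factorizations approximate the data with.

```python
import numpy as np

def maxplus_product(U, V):
    """Max-plus matrix product: R[i, j] = max_k (U[i, k] + V[k, j])."""
    U = np.asarray(U, dtype=float)
    V = np.asarray(V, dtype=float)
    # Broadcast to shape (m, r, n), then take the max over the shared index k.
    return (U[:, :, None] + V[None, :, :]).max(axis=1)

# Hypothetical rank-2 factors of a 3 x 3 matrix.
U = np.array([[1.0, 4.0],
              [0.0, 2.0],
              [3.0, 1.0]])
V = np.array([[2.0, 0.0, 5.0],
              [1.0, 3.0, 0.0]])

R_tropical = maxplus_product(U, V)  # each entry is set by a single dominant factor
R_linear = U @ V                    # each entry is a sum over all factors

print(R_tropical)
print(R_linear)
```

In the max-plus product each entry is determined by the single largest term `U[i, k] + V[k, j]`, which is one way to see why such a model can follow extreme values more closely than a sum of products.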