Paper Title

Linear Discriminant Analysis with the Randomized Kaczmarz Method

Paper Authors

Jocelyn T. Chi and Deanna Needell

Paper Abstract

We present a randomized Kaczmarz method for linear discriminant analysis (rkLDA), an iterative randomized approach to binary-class Gaussian model linear discriminant analysis (LDA) for very large data. We harness a least squares formulation and mobilize the stochastic gradient descent framework to obtain a randomized classifier whose performance can achieve accuracy comparable to that of full data LDA. We present an analysis of the expected change in the LDA discriminant function when the randomized Kaczmarz solution is employed in lieu of the full data least squares solution; the analysis accounts for both the Gaussian modeling assumptions on the data and the algorithmic randomness. Our analysis shows how the expected change depends on quantities inherent in the data, such as the scaled condition number and Frobenius norm of the input data, how well the linear model fits the data, and choices made in the randomized algorithm. Our experiments demonstrate that rkLDA can offer a viable alternative to full data LDA over a range of step sizes and numbers of iterations.
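The abstract describes the method only at a high level. As a concrete illustration, the sketch below applies standard randomized Kaczmarz iterations to a least-squares formulation of binary-class LDA. This is not the authors' rkLDA implementation: the function name rk_lda, the +n/n1 and -n/n2 label coding, the centering with an explicit intercept column, and all parameter defaults are assumptions made for illustration only.

```python
import numpy as np


def rk_lda(X, y, step_size=1.0, n_iters=10000, seed=None):
    """Hypothetical sketch: randomized Kaczmarz applied to a least-squares
    formulation of binary-class LDA (an illustration, not the paper's code).

    X : (n, d) array of observations (rows are samples).
    y : (n,) array of binary labels in {0, 1}.
    Returns an approximate coefficient vector w (intercept first), so that
    the sign of [1, x - x_bar] @ w serves as the discriminant.
    """
    rng = np.random.default_rng(seed)
    n, _ = X.shape

    # One common label coding that links least squares to binary LDA:
    # +n/n1 for class 1 and -n/n2 for class 0 (an assumption here).
    n1 = int(np.sum(y == 1))
    n2 = n - n1
    t = np.where(y == 1, n / n1, -n / n2)

    # Center the predictors and prepend an intercept column.
    A = np.hstack([np.ones((n, 1)), X - X.mean(axis=0)])

    # Standard randomized Kaczmarz: sample row i with probability
    # proportional to ||a_i||^2, then take a step toward the hyperplane
    # a_i @ w = t_i, scaled by the step size.
    row_norms_sq = np.einsum("ij,ij->i", A, A)
    probs = row_norms_sq / row_norms_sq.sum()

    w = np.zeros(A.shape[1])
    for _ in range(n_iters):
        i = rng.choice(n, p=probs)
        a_i = A[i]
        w += step_size * ((t[i] - a_i @ w) / row_norms_sq[i]) * a_i
    return w


if __name__ == "__main__":
    # Toy usage on synthetic two-class Gaussian data.
    rng = np.random.default_rng(0)
    X = np.vstack([rng.normal(0.0, 1.0, (200, 5)),
                   rng.normal(1.5, 1.0, (200, 5))])
    y = np.repeat([0, 1], 200)
    w = rk_lda(X, y, step_size=1.0, n_iters=5000, seed=1)
    A = np.hstack([np.ones((X.shape[0], 1)), X - X.mean(axis=0)])
    pred = (A @ w > 0).astype(int)
    print("training accuracy:", np.mean(pred == y))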
