论文标题
几乎是线性稀疏制度的词典学习
Dictionary Learning for the Almost-Linear Sparsity Regime
论文作者
论文摘要
词典学习,恢复稀疏使用的矩阵$ \ mathbf {d} \ in \ mathbb {r} $ \ mathbf {y} _i = \ Mathbf {d} \ Mathbf {X} _i $,对信号处理和数据科学中的应用程序的重要性越来越重要。 When the dictionary is known, recovery of $\mathbf{x}_i$ is possible even for sparsity linear in dimension $M$, yet to date, the only algorithms which provably succeed in the linear sparsity regime are Riemannian trust-region methods, which are limited to orthogonal dictionaries, and methods based on the sum-of-squares hierarchy, which requires super-polynomial为了获得损失$ m $的错误的时间。在这项工作中,我们介绍了零星(光谱甲骨文词典学习),这是一种有效的重新加权协方差矩阵家族的光谱方法。我们证明,在足够高的维度下,零星可以恢复过于循环($ k> m $)词典,即使稀疏性在尺寸上是线性的,即使在对数因素上是线性的,却满足了众所周知的限制等轴测特性(RIP)。此外,这些准确性保证具有``甲骨文的属性'',即可以在很高的可能性中准确地恢复未知稀疏矢量$ \ mathbf {x} _i $的支持和迹象,从而使$ \ mathbf {d} $任意地估计$ \ mathbf {d} $在polynomial time time time time time time time time poloty poloty time time time poloty time。这种收敛保证了在近线性稀疏状态下过度累积的RIP矩阵的保证。
Dictionary learning, the problem of recovering a sparsely used matrix $\mathbf{D} \in \mathbb{R}^{M \times K}$ and $N$ $s$-sparse vectors $\mathbf{x}_i \in \mathbb{R}^{K}$ from samples of the form $\mathbf{y}_i = \mathbf{D}\mathbf{x}_i$, is of increasing importance to applications in signal processing and data science. When the dictionary is known, recovery of $\mathbf{x}_i$ is possible even for sparsity linear in dimension $M$, yet to date, the only algorithms which provably succeed in the linear sparsity regime are Riemannian trust-region methods, which are limited to orthogonal dictionaries, and methods based on the sum-of-squares hierarchy, which requires super-polynomial time in order to obtain an error which decays in $M$. In this work, we introduce SPORADIC (SPectral ORAcle DICtionary Learning), an efficient spectral method on family of reweighted covariance matrices. We prove that in high enough dimensions, SPORADIC can recover overcomplete ($K > M$) dictionaries satisfying the well-known restricted isometry property (RIP) even when sparsity is linear in dimension up to logarithmic factors. Moreover, these accuracy guarantees have an ``oracle property" that the support and signs of the unknown sparse vectors $\mathbf{x}_i$ can be recovered exactly with high probability, allowing for arbitrarily close estimation of $\mathbf{D}$ with enough samples in polynomial time. To the author's knowledge, SPORADIC is the first polynomial-time algorithm which provably enjoys such convergence guarantees for overcomplete RIP matrices in the near-linear sparsity regime.