Paper Title

Spectrum of inner-product kernel matrices in the polynomial regime and multiple descent phenomenon in kernel ridge regression

Paper Authors

Misiakiewicz, Theodor

Paper Abstract

We study the spectrum of inner-product kernel matrices, i.e., $n \times n$ matrices with entries $h(\langle \textbf{x}_i, \textbf{x}_j \rangle/d)$ where the $(\textbf{x}_i)_{i \leq n}$ are i.i.d.~random covariates in $\mathbb{R}^d$. In the linear high-dimensional regime $n \asymp d$, it was shown that these matrices are well approximated by their linearization, which simplifies into the sum of a rescaled Wishart matrix and an identity matrix. In this paper, we generalize this decomposition to the polynomial high-dimensional regime $n \asymp d^\ell$, $\ell \in \mathbb{N}$, for data uniformly distributed on the sphere and hypercube. In this regime, the kernel matrix is well approximated by its degree-$\ell$ polynomial approximation and can be decomposed into a low-rank spike matrix, an identity part, and a `Gegenbauer matrix' with entries $Q_\ell(\langle \textbf{x}_i, \textbf{x}_j \rangle)$, where $Q_\ell$ is the degree-$\ell$ Gegenbauer polynomial. We show that the spectrum of the Gegenbauer matrix converges in distribution to a Marchenko-Pastur law. This problem is motivated by the study of the prediction error of kernel ridge regression (KRR) in the polynomial regime $n \asymp d^\kappa$, $\kappa > 0$. Previous work showed that for $\kappa \not\in \mathbb{N}$, KRR fits exactly a degree-$\lfloor \kappa \rfloor$ polynomial approximation to the target function. In this paper, we use our characterization of the kernel matrix to complete this picture and compute the precise asymptotics of the test error in the limit $n/d^\kappa \to \psi$ with $\kappa \in \mathbb{N}$. In this case, the test error can display a double descent behavior, depending on the effective regularization and signal-to-noise ratio at level $\kappa$. Because this double descent can occur each time $\kappa$ crosses an integer, this explains the multiple descent phenomenon in the KRR risk curve observed in several previous works.
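As a rough schematic of the decomposition described in the abstract (a sketch assuming data uniform on the sphere and $n \asymp d^\ell$; the coefficients $\mu_k$, $\mu_{>\ell}$ and the matrices $\boldsymbol{\Psi}_k$ are illustrative placeholder notation, not the paper's), the degree-$\ell$ approximation of the kernel matrix can be written as
\[
\boldsymbol{K} = \bigl( h(\langle \textbf{x}_i, \textbf{x}_j \rangle / d) \bigr)_{i,j \leq n}
\;\approx\;
\underbrace{\textstyle\sum_{k < \ell} \mu_k \, \boldsymbol{\Psi}_k \boldsymbol{\Psi}_k^{\top}}_{\text{low-rank spike}}
\;+\;
\underbrace{\mu_\ell \, \bigl( Q_\ell(\langle \textbf{x}_i, \textbf{x}_j \rangle) \bigr)_{i,j \leq n}}_{\text{Gegenbauer matrix}}
\;+\;
\underbrace{\mu_{>\ell} \, \mathbf{I}_n}_{\text{identity part}},
\]
where $\boldsymbol{\Psi}_k$ stacks the degree-$k$ spherical harmonics evaluated at the data points (so the first term has rank of order $d^k \ll n$) and the $\mu$'s are kernel-dependent constants; on this reading, the Marchenko-Pastur limit concerns the spectrum of the Gegenbauer block, whose effective aspect ratio is governed by $n/d^\ell$. This is an interpretive sketch of the abstract's statement, not a formula taken from the paper.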
