Title


On the well-spread property and its relation to linear regression

Authors

Hongjie Chen, Tommaso d'Orsi

Abstract


We consider the robust linear regression model $\boldsymbol{y} = X\beta^* + \boldsymbol{\eta}$, where an adversary oblivious to the design $X \in \mathbb{R}^{n \times d}$ may choose $\boldsymbol{\eta}$ to corrupt all but a (possibly vanishing) fraction of the observations $\boldsymbol{y}$ in an arbitrary way. Recent work [dLN+21, dNS21] has introduced efficient algorithms for consistent recovery of the parameter vector. These algorithms crucially rely on the design matrix being well-spread (a matrix is well-spread if its column span is far from any sparse vector). In this paper, we show that there exists a family of design matrices lacking well-spreadness such that consistent recovery of the parameter vector in the above robust linear regression model is information-theoretically impossible. We further investigate the average-case time complexity of certifying well-spreadness of random matrices. We show that it is possible to efficiently certify whether a given $n$-by-$d$ Gaussian matrix is well-spread if the number of observations is quadratic in the ambient dimension. We complement this result by showing rigorous evidence -- in the form of a lower bound against low-degree polynomials -- of the computational hardness of this same certification problem when the number of observations is $o(d^2)$.
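To make the well-spread property concrete, here is a minimal illustrative sketch (not from the paper; all parameter values are arbitrary choices): it samples random unit vectors in the column span of a Gaussian design and measures how much of their $\ell_2$ mass can sit on any small coordinate set. Large leftover mass is consistent with the span being far from sparse vectors; note that random sampling only probes the property and is not a certificate, which is exactly the certification question the abstract addresses.

```python
import numpy as np

# Illustrative probe of the well-spread property of a Gaussian design
# X in R^{n x d}: a matrix is well-spread if every unit vector v in its
# column span is far from every sparse vector, i.e. v cannot concentrate
# its l2 mass on a small set of coordinates.
rng = np.random.default_rng(0)
n, d, k, trials = 400, 20, 20, 200  # k = sparsity level being probed

X = rng.standard_normal((n, d))
Q, _ = np.linalg.qr(X)  # orthonormal basis for span(X)

worst = 1.0
for _ in range(trials):
    c = rng.standard_normal(d)
    v = Q @ (c / np.linalg.norm(c))        # random unit vector in span(X)
    top_k_mass = np.sort(v**2)[-k:].sum()  # l2 mass on the k largest coords
    worst = min(worst, 1.0 - top_k_mass)   # mass forced off every top-k set

# `worst` close to 1 suggests the sampled span vectors are far from
# k-sparse; an actual certificate must bound this over ALL span vectors.
print(f"min off-support mass over sampled span vectors: {worst:.3f}")
```

Since Gaussian vectors spread their mass nearly uniformly over the $n$ coordinates, the sampled off-support mass stays bounded away from zero here; the paper's hardness result concerns efficiently certifying such a bound for the whole span when $n = o(d^2)$.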
