论文标题
通过条件梯度类型方法进行健壮的结构化统计估计
Robust Structured Statistical Estimation via Conditional Gradient Type Methods
论文作者
论文摘要
结构化统计估计问题通常是通过条件梯度(CG)类型方法来解决的,以避免计算昂贵的投影操作。但是,现有的CG类型方法对数据损坏并不强大。为了解决这个问题,我们建议针对Huber的腐败模型和重尾数据来鲁棒化CG类型方法。首先,我们证明两种成对CG方法是稳定的,即不要累积误差。因此,与稳健的平均梯度估计技术相结合,我们可以保证对广泛的问题的鲁棒性,但现在在无投射算法框架中。接下来,我们考虑高维问题。鲁棒的基于平均估计的方法可能具有不可接受的样本复杂性。当约束集是$ \ ell_0 $ norm ball时,最近已经开发了基于迭代抑制势头的方法。然而,即使对于具有$ O(d)$ Extreme积分的一般组合的一般组,扩展也是不平凡的。为了设置可行集的设置$ o(\ text {poly}(d))$ Extreme点,我们根据一种新的条件开发了一种新颖的鲁棒性方法,我们称为稳健的原子选择条件(RASC)。满足RASC时,我们的方法会以相应的统计误差线性收敛,样品复杂性在问题的稀疏性中正确缩放,而不是基于强大的平均估计方法所要求的,而不是环境维度。
Structured statistical estimation problems are often solved by Conditional Gradient (CG) type methods to avoid the computationally expensive projection operation. However, the existing CG type methods are not robust to data corruption. To address this, we propose to robustify CG type methods against Huber's corruption model and heavy-tailed data. First, we show that the two Pairwise CG methods are stable, i.e., do not accumulate error. Combined with robust mean gradient estimation techniques, we can therefore guarantee robustness to a wide class of problems, but now in a projection-free algorithmic framework. Next, we consider high dimensional problems. Robust mean estimation based approaches may have an unacceptably high sample complexity. When the constraint set is a $\ell_0$ norm ball, Iterative-Hard-Thresholding-based methods have been developed recently. Yet extension is non-trivial even for general sets with $O(d)$ extreme points. For setting where the feasible set has $O(\text{poly}(d))$ extreme points, we develop a novel robustness method, based on a new condition we call the Robust Atom Selection Condition (RASC). When RASC is satisfied, our method converges linearly with a corresponding statistical error, with sample complexity that scales correctly in the sparsity of the problem, rather than the ambient dimension as would be required by any approach based on robust mean estimation.