Paper Title
A Regret-Variance Trade-Off in Online Learning
Paper Authors
Paper Abstract
We consider prediction with expert advice for strongly convex and bounded losses, and investigate trade-offs between regret and "variance" (i.e., the squared difference between the learner's predictions and the best expert's predictions). With $K$ experts, the Exponentially Weighted Average (EWA) algorithm is known to achieve $O(\log K)$ regret. We prove that a variant of EWA either achieves negative regret (i.e., the algorithm outperforms the best expert) or guarantees an $O(\log K)$ bound on both variance and regret. Building on this result, we show several examples of how the variance of predictions can be exploited in learning. In the online-to-batch analysis, we show that a large empirical variance allows us to stop the online-to-batch conversion early and improve on the risk of the best predictor in the class. When we do not consider early stopping, we also recover the optimal rate of model selection aggregation. In online prediction with corrupted losses, we show that the effect of corruption on the regret can be compensated for by a large variance. In online selective sampling, we design an algorithm that samples less frequently when the variance is large, while guaranteeing the optimal regret bound in expectation. In online learning with abstention, we use a term similar to the variance to derive the first high-probability $O(\log K)$ regret bound in this setting. Finally, we extend our results to the setting of online linear regression.
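For concreteness, the regret and "variance" discussed in the abstract can be written as follows (the notation is introduced here for illustration; the paper may use different symbols):

\[
\mathrm{Regret}_T \;=\; \sum_{t=1}^{T} \ell_t(\hat y_t) \;-\; \min_{k \in [K]} \sum_{t=1}^{T} \ell_t(y_{t,k}),
\qquad
V_T \;=\; \sum_{t=1}^{T} \bigl(\hat y_t - y_{t,k^\star}\bigr)^2,
\]

where $\hat y_t$ is the learner's prediction at round $t$, $y_{t,k}$ is expert $k$'s prediction, $\ell_t$ is the (strongly convex, bounded) loss at round $t$, and $k^\star$ is the best expert in hindsight.

The abstract refers to a variant of EWA without specifying the variant, so the following is only a minimal sketch of the standard EWA forecaster to which the $O(\log K)$ regret claim refers. The learning rate `eta` and the squared loss in the usage example are illustrative assumptions, not the paper's choices.

```python
import numpy as np

def ewa_predict(expert_preds, cum_losses, eta):
    """One round of the standard Exponentially Weighted Average forecaster.

    expert_preds: shape (K,), each expert's prediction for this round.
    cum_losses:   shape (K,), each expert's cumulative loss so far.
    eta:          learning rate (> 0); an illustrative free parameter here.
    Returns the learner's prediction: a weighted average of expert predictions.
    """
    # Weights decay exponentially in cumulative loss; subtracting the max
    # exponent keeps the softmax numerically stable.
    logits = -eta * cum_losses
    weights = np.exp(logits - logits.max())
    weights /= weights.sum()
    return float(weights @ expert_preds)

# Illustrative usage with squared loss (strongly convex and bounded on [0, 1]):
rng = np.random.default_rng(0)
K, T, eta = 5, 100, 2.0
cum_losses = np.zeros(K)
for t in range(T):
    preds = rng.uniform(0.0, 1.0, size=K)  # experts' predictions this round
    y_hat = ewa_predict(preds, cum_losses, eta)
    y = rng.uniform(0.0, 1.0)              # outcome revealed after predicting
    cum_losses += (preds - y) ** 2         # update each expert's cumulative loss
```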