Paper Title

Self-Tuning Stochastic Optimization with Curvature-Aware Gradient Filtering

Authors

Ricky T. Q. Chen, Dami Choi, Lukas Balles, David Duvenaud, Philipp Hennig

Abstract

Standard first-order stochastic optimization algorithms base their updates solely on the average mini-batch gradient, and it has been shown that tracking additional quantities such as the curvature can help de-sensitize common hyperparameters. Based on this intuition, we explore the use of exact per-sample Hessian-vector products and gradients to construct optimizers that are self-tuning and hyperparameter-free. Starting from a dynamics model of the gradient, we derive a process that leads to a curvature-corrected, noise-adaptive online gradient estimate. The smoothness of our updates makes them more amenable to simple step size selection schemes, which we also base on our estimated quantities. We prove that our model-based procedure converges in the noisy quadratic setting. Though we do not see similar gains in deep learning tasks, we match the performance of well-tuned optimizers and, ultimately, this is an interesting step toward constructing self-tuning optimizers.
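The mechanism in the abstract can be pictured as a Kalman-filter-style predict/correct loop on the gradient: curvature (via Hessian-vector products) propagates the old estimate to the new parameter values, and per-sample gradient statistics set how much to trust the fresh mini-batch observation. The JAX sketch below is a minimal illustration under that reading, not the authors' implementation; `loss_fn`, `filter_step`, the diagonal variance, and the `process_noise` constant are all assumptions made for the example.

```python
# Illustrative sketch (not the authors' released code) of a curvature-corrected,
# noise-adaptive gradient filter built from exact per-sample gradients and
# Hessian-vector products. Names and the diagonal filter are assumptions.

import jax
import jax.numpy as jnp


def loss_fn(params, x, y):
    # Stand-in differentiable per-sample loss (least squares).
    return 0.5 * (x @ params - y) ** 2


# Per-sample gradients: vmap the gradient over the batch dimension.
per_sample_grad = jax.vmap(jax.grad(loss_fn), in_axes=(None, 0, 0))


def per_sample_hvp(params, x, y, v):
    # Exact per-sample Hessian-vector product H_i @ v via forward-over-reverse AD.
    grad_fn = lambda p: jax.grad(loss_fn)(p, x, y)
    return jax.jvp(grad_fn, (params,), (v,))[1]


per_sample_hvps = jax.vmap(per_sample_hvp, in_axes=(None, 0, 0, None))


def filter_step(g_est, var_est, params, delta, xs, ys, process_noise=1e-4):
    """One predict/correct step of a per-coordinate gradient filter.

    Predict: propagate the previous estimate through a local quadratic model
    of the gradient dynamics, g_pred = g_est + H @ delta.
    Correct: blend the prediction with the observed mini-batch gradient,
    weighted by the per-sample gradient variance (the noise adaptation).
    """
    batch_size = xs.shape[0]

    # Curvature-corrected prediction using the mean exact HVP.
    g_pred = g_est + per_sample_hvps(params, xs, ys, delta).mean(axis=0)
    var_pred = var_est + process_noise

    # Observation: mini-batch gradient and the variance of its mean.
    grads = per_sample_grad(params, xs, ys)
    g_obs = grads.mean(axis=0)
    obs_var = grads.var(axis=0) / batch_size

    # Kalman-style gain: trust the observation less when it is noisier.
    gain = var_pred / (var_pred + obs_var)
    g_new = g_pred + gain * (g_obs - g_pred)
    var_new = (1.0 - gain) * var_pred
    return g_new, var_new


# Toy usage on synthetic linear-regression data.
key = jax.random.PRNGKey(0)
xs = jax.random.normal(key, (32, 5))
ys = xs @ jnp.arange(5.0)
params = jnp.zeros(5)
g_est, var_est = jnp.zeros(5), jnp.ones(5)
delta = jnp.zeros(5)  # parameter step taken since the last filter update
g_est, var_est = filter_step(g_est, var_est, params, delta, xs, ys)
```

The sketch only fixes the structure the abstract describes, curvature in the prediction step and per-sample statistics in the correction step; the paper's actual filter and its step size selection scheme may differ in detail.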
