Paper Title
Scalable Hyperparameter Optimization with Lazy Gaussian Processes
Paper Authors
Paper Abstract
Most machine learning methods require careful selection of hyper-parameters in order to train a high-performing model with good generalization abilities. Hence, several automatic selection algorithms have been introduced to overcome the tedious manual (trial-and-error) tuning of these parameters. Due to its very high sample efficiency, Bayesian Optimization over a Gaussian Process model of the parameter space has become the method of choice. Unfortunately, this approach suffers from cubic compute complexity due to the underlying Cholesky factorization, which makes it very hard to scale beyond a small number of sampling steps. In this paper, we present a novel, highly accurate approximation of the underlying Gaussian Process. Reducing its computational complexity from cubic to quadratic allows efficient strong scaling of Bayesian Optimization while outperforming the previous approach in optimization accuracy. First experiments show a speedup by a factor of 162 on a single node and a further speedup by a factor of 5 in a parallel environment.
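To make the bottleneck mentioned in the abstract concrete, the sketch below shows the standard exact Gaussian Process posterior used inside a Bayesian Optimization loop, where the Cholesky factorization of the n x n kernel matrix costs O(n^3) in the number of sampled configurations. This is a minimal illustration only, not the paper's lazy approximation; all function and variable names here are assumptions chosen for the example.

```python
# Minimal sketch (not the paper's method): exact GP regression as used in
# Bayesian Optimization. The Cholesky factorization below is the O(n^3) step
# the paper replaces with a quadratic-cost approximation.
import numpy as np

def rbf_kernel(A, B, length_scale=1.0):
    """Squared-exponential kernel between two sets of points (hypothetical choice)."""
    d2 = np.sum(A**2, axis=1)[:, None] + np.sum(B**2, axis=1)[None, :] - 2.0 * A @ B.T
    return np.exp(-0.5 * d2 / length_scale**2)

def gp_posterior(X, y, X_star, noise=1e-6):
    """Exact GP posterior mean and variance at candidate points X_star."""
    K = rbf_kernel(X, X) + noise * np.eye(len(X))
    L = np.linalg.cholesky(K)                              # cubic in the number of samples
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))    # solve K alpha = y via L
    K_s = rbf_kernel(X, X_star)
    mean = K_s.T @ alpha
    v = np.linalg.solve(L, K_s)
    var = rbf_kernel(X_star, X_star).diagonal() - np.sum(v**2, axis=0)
    return mean, var

# Usage: score candidate hyper-parameter configurations, e.g. for an acquisition function.
X = np.random.rand(50, 2)                  # previously sampled hyper-parameter configurations
y = np.sin(X[:, 0]) + np.cos(X[:, 1])      # observed objective values (toy example)
mu, sigma2 = gp_posterior(X, y, np.random.rand(200, 2))
```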