Paper Title

Speedy Performance Estimation for Neural Architecture Search

Paper Authors

Binxin Ru, Clare Lyle, Lisa Schut, Miroslav Fil, Mark van der Wilk, Yarin Gal

Paper Abstract

Reliable yet efficient evaluation of generalisation performance of a proposed architecture is crucial to the success of neural architecture search (NAS). Traditional approaches face a variety of limitations: training each architecture to completion is prohibitively expensive, early stopped validation accuracy may correlate poorly with fully trained performance, and model-based estimators require large training sets. We instead propose to estimate the final test performance based on a simple measure of training speed. Our estimator is theoretically motivated by the connection between generalisation and training speed, and is also inspired by the reformulation of a PAC-Bayes bound under the Bayesian setting. Our model-free estimator is simple, efficient, and cheap to implement, and does not require hyperparameter tuning or surrogate training before deployment. We demonstrate on various NAS search spaces that our estimator consistently outperforms other alternatives in achieving better correlation with the true test performance rankings. We further show that our estimator can be easily incorporated into both query-based and one-shot NAS methods to improve the speed or quality of the search.
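The abstract leaves the exact form of the estimator to the paper body; as a minimal sketch of "a simple measure of training speed", one can score a candidate architecture by summing its minibatch training losses over the first few epochs, so that a smaller sum (faster training) suggests better expected generalisation. The function name `training_speed_score`, its signature, and the three-epoch default below are illustrative assumptions, not the authors' implementation.

```python
# A minimal sketch of a training-speed-based score, assuming PyTorch-style
# model / optimizer / loss objects passed in by the caller. All names and
# defaults here are illustrative assumptions, not the authors' exact code.
def training_speed_score(model, train_loader, loss_fn, optimizer,
                         num_epochs: int = 3) -> float:
    """Sum of minibatch training losses over the first `num_epochs` epochs.

    A lower sum means the architecture trains faster, which the paper
    links to better generalisation; the value is used only to rank
    candidate architectures, never as an absolute accuracy estimate.
    """
    model.train()
    total = 0.0
    for _ in range(num_epochs):
        for x, y in train_loader:
            optimizer.zero_grad()
            loss = loss_fn(model(x), y)
            loss.backward()
            optimizer.step()
            total += loss.item()  # accumulate the area under the training curve
    return total
```

Under this scheme, candidates would be ranked by ascending score, and the estimator's quality judged by rank correlation (e.g. Spearman's rho) against fully trained test accuracy, matching the ranking-based evaluation described in the abstract.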
