Paper Title
Powering One-shot Topological NAS with Stabilized Share-parameter Proxy
Paper Authors
Paper Abstract
The one-shot NAS method has attracted much interest from the research community due to its remarkable training efficiency and its capacity to discover high-performance models. However, the search spaces of previous one-shot works have usually relied on hand-crafted design and offered little flexibility in network topology. In this work, we try to enhance one-shot NAS by exploring high-performing network architectures in our large-scale Topology Augmented Search Space (i.e., over 3.4×10^10 different topological structures). Specifically, the difficulty of architecture search in such a complex space is eliminated by the proposed stabilized share-parameter proxy, which employs Stochastic Gradient Langevin Dynamics to enable fast shared-parameter sampling, so as to achieve stabilized measurement of architecture performance even in search spaces with complex topological structures. The proposed method, namely Stabilized Topological Neural Architecture Search (ST-NAS), achieves state-of-the-art performance under the Multiply-Adds (MAdds) constraint on ImageNet. Our lite model ST-NAS-A achieves 76.4% top-1 accuracy with only 326M MAdds. Our moderate model ST-NAS-B achieves 77.9% top-1 accuracy while requiring only 503M MAdds. Both models offer superior performance in comparison to other concurrent works on one-shot NAS.
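The stabilized share-parameter proxy described above relies on Stochastic Gradient Langevin Dynamics (SGLD) to draw samples of the shared supernet weights. As a point of reference, below is a minimal sketch of a generic SGLD update step, not the authors' implementation; the names `sgld_step`, `loss_fn`, and `lr` are illustrative assumptions.

```python
import torch

def sgld_step(params, loss_fn, lr=1e-3):
    """One generic Stochastic Gradient Langevin Dynamics (SGLD) update.

    Takes a gradient step on the loss and adds Gaussian noise whose
    variance (2 * lr) matches the step size, so repeated updates behave
    like samples from a distribution over the parameters rather than
    converging to a single point estimate.
    """
    loss = loss_fn()
    grads = torch.autograd.grad(loss, params)
    with torch.no_grad():
        for p, g in zip(params, grads):
            noise = torch.randn_like(p) * (2.0 * lr) ** 0.5
            p.add_(-lr * g + noise)  # gradient step plus Langevin noise
    return loss.item()

# Illustrative usage: perturb a toy shared-weight tensor with SGLD.
w = torch.randn(8, 8, requires_grad=True)
for _ in range(10):
    sgld_step([w], lambda: (w ** 2).sum(), lr=1e-2)
```

The injected noise is what distinguishes SGLD from plain SGD: instead of a single converged weight setting, it yields a stream of weight samples, which is the property the abstract credits for stabilizing architecture performance measurements.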