Paper Title


G-DARTS-A: Groups of Channel Parallel Sampling with Attention

Authors

Zhaowen Wang, Wei Zhang, Zhiming Wang

Abstract


Differentiable Architecture Search (DARTS) provides a baseline for gradient-based search of effective network architectures, but it is accompanied by a huge computational overhead in searching and training the network architecture. Recently, many novel works have improved DARTS. In particular, Partially-Connected DARTS (PC-DARTS) proposed a partial channel sampling technique that achieved good results. In this work, we found that the backbone provided by DARTS is prone to overfitting. To mitigate this problem, we propose an approach named Group-DARTS with Attention (G-DARTS-A), which uses multiple groups of channels for searching. Inspired by the partial sampling strategy of PC-DARTS, we use groups of channels to sample the super-network, performing a more efficient search while maintaining the relative integrity of the network information. To relieve the competition between channel groups and keep the channels balanced, we follow the attention mechanism of the Squeeze-and-Excitation Network. Each group of channels shares the defined weights, hence the groups can provide different suggestions for the search. The searched architecture is more powerful and better adapted to different deployments. Specifically, by only using the attention module on DARTS, we achieve an error rate of 2.82%/16.36% on CIFAR10/100 with 0.3 GPU-days for the search process on CIFAR10. Applying our G-DARTS-A to DARTS/PC-DARTS, an error rate of 2.57%/2.61% on CIFAR10 is achieved with 0.5/0.4 GPU-days.
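The Squeeze-and-Excitation attention that the abstract follows can be sketched in a few lines: squeeze each channel to a single descriptor by global average pooling, pass the descriptors through a small two-layer excitation network, and rescale each channel by the resulting sigmoid weight. The sketch below is a minimal plain-Python illustration under our own assumptions; the function name `se_attention`, the list-based data layout, and the toy weight matrices are illustrative, not the paper's implementation (in G-DARTS-A this kind of reweighting would presumably act on the channel groups to keep them balanced).

```python
import math

def se_attention(feature_map, w1, w2):
    """Minimal SE-style channel attention (illustrative sketch).

    feature_map: list of C channels, each a flat list of spatial values.
    w1: C x H weights of the reduction layer (H = C // reduction ratio).
    w2: H x C weights of the expansion layer.
    """
    # Squeeze: global average pooling per channel -> C-dim descriptor
    z = [sum(ch) / len(ch) for ch in feature_map]
    # Excitation, step 1: fully-connected reduction + ReLU
    hidden = [max(0.0, sum(z[i] * w1[i][j] for i in range(len(z))))
              for j in range(len(w1[0]))]
    # Excitation, step 2: fully-connected expansion + sigmoid gate
    scale = [1.0 / (1.0 + math.exp(-sum(hidden[j] * w2[j][k]
                                        for j in range(len(hidden)))))
             for k in range(len(feature_map))]
    # Reweight: scale every value of a channel by its attention weight
    return [[v * s for v in ch] for ch, s in zip(feature_map, scale)]
```

With identity weight matrices and channel means of 1.0 and 2.0, the two channels are scaled by sigmoid(1) ≈ 0.731 and sigmoid(2) ≈ 0.881 respectively, so the channel with the stronger response is suppressed less.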
