Paper Title
Small-Group Learning, with Application to Neural Architecture Search
Paper Authors
Paper Abstract
In human learning, an effective methodology is small-group learning: a small group of students works together towards the same learning objective, expressing their understanding of a topic to their peers, comparing their ideas, and helping each other troubleshoot problems. In this paper, we investigate whether this human learning method can be borrowed to train better machine learning models, by developing a novel ML framework -- small-group learning (SGL). In our framework, a group of learners (ML models) with different model architectures collaboratively help each other learn by leveraging their complementary advantages. Specifically, each learner uses its intermediately trained model to generate a pseudo-labeled dataset and retrains its model using the pseudo-labeled datasets generated by other learners. SGL is formulated as a multi-level optimization framework consisting of three learning stages: each learner trains a model independently and uses this model to perform pseudo-labeling; each learner trains another model using datasets pseudo-labeled by other learners; learners improve their architectures by minimizing validation losses. An efficient algorithm is developed to solve the multi-level optimization problem. We apply SGL to neural architecture search. Results on CIFAR-100, CIFAR-10, and ImageNet demonstrate the effectiveness of our method.
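Below is a minimal, runnable sketch of the three-stage loop the abstract describes, written in PyTorch on synthetic data. It is an illustration under stated assumptions, not the authors' implementation: two learners, a toy DARTS-style two-op search space standing in for a real architecture space, hard pseudo-labels, and simple first-order alternating updates in place of the paper's exact multi-level optimization scheme. All names (MixedClassifier, alphas, pseudo, the data splits) are hypothetical.

```python
# Hedged sketch of the three SGL stages: (I) train W_k and pseudo-label,
# (II) train V_k on peer pseudo-labels, (III) update architecture A_k on
# validation loss. Everything here is illustrative, not the paper's code.
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)
D, C, N = 16, 4, 128  # feature dim, number of classes, samples per split

class MixedClassifier(nn.Module):
    """Toy searchable model: `alpha` softly mixes a linear op and an MLP op
    (a DARTS-style relaxation standing in for a real search space)."""
    def __init__(self, hidden):
        super().__init__()
        self.linear = nn.Linear(D, C)
        self.mlp = nn.Sequential(nn.Linear(D, hidden), nn.ReLU(),
                                 nn.Linear(hidden, C))

    def forward(self, x, alpha):
        w = F.softmax(alpha, dim=0)
        return w[0] * self.linear(x) + w[1] * self.mlp(x)

# Synthetic labeled / unlabeled / validation splits (placeholders for real data).
X_tr, y_tr = torch.randn(N, D), torch.randint(0, C, (N,))
X_un = torch.randn(N, D)                      # unlabeled pool to pseudo-label
X_va, y_va = torch.randn(N, D), torch.randint(0, C, (N,))

K = 2  # two learners with complementary architectures (different hidden sizes)
alphas = [torch.zeros(2, requires_grad=True) for _ in range(K)]  # A_k
first  = [MixedClassifier(h) for h in (32, 64)]   # stage-I models W_k
second = [MixedClassifier(h) for h in (32, 64)]   # stage-II models V_k

w_opts = [torch.optim.SGD(m.parameters(), lr=0.1) for m in first]
v_opts = [torch.optim.SGD(m.parameters(), lr=0.1) for m in second]
a_opts = [torch.optim.SGD([a], lr=0.05) for a in alphas]

for step in range(50):
    # Stage I: each learner trains W_k on labeled data under its current
    # architecture, then pseudo-labels the unlabeled pool with W_k.
    pseudo = []
    for k in range(K):
        w_opts[k].zero_grad()
        F.cross_entropy(first[k](X_tr, alphas[k].detach()), y_tr).backward()
        w_opts[k].step()
        with torch.no_grad():
            pseudo.append(first[k](X_un, alphas[k]).argmax(dim=1))

    # Stage II: each learner trains V_k on labeled data plus the dataset
    # pseudo-labeled by the *other* learner (peer supervision).
    for k in range(K):
        v_opts[k].zero_grad()
        loss = (F.cross_entropy(second[k](X_tr, alphas[k].detach()), y_tr)
                + F.cross_entropy(second[k](X_un, alphas[k].detach()),
                                  pseudo[1 - k]))
        loss.backward()
        v_opts[k].step()

    # Stage III: each learner updates its architecture A_k by minimizing
    # V_k's validation loss (first-order alternating updates here, rather
    # than the paper's exact multi-level optimization).
    for k in range(K):
        a_opts[k].zero_grad()
        F.cross_entropy(second[k](X_va, alphas[k]), y_va).backward()
        a_opts[k].step()
```

After search, one would typically discretize each learner's `alpha` (keep the highest-weight op) and retrain the chosen architecture from scratch, as is standard in differentiable NAS; that evaluation step is omitted from the sketch.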