Paper Title
The Curse of Low Task Diversity: On the Failure of Transfer Learning to Outperform MAML and Their Empirical Equivalence
Paper Authors
Paper Abstract
Recently, it has been observed that a transfer learning solution might be all we need to solve many few-shot learning benchmarks -- thus raising important questions about when and how meta-learning algorithms should be deployed. In this paper, we seek to clarify these questions by (1) proposing a novel metric -- the diversity coefficient -- to measure the diversity of tasks in a few-shot learning benchmark, and (2) comparing Model-Agnostic Meta-Learning (MAML) and transfer learning under fair conditions (same architecture, same optimizer, and all models trained to convergence). Using the diversity coefficient, we show that the popular MiniImageNet and CIFAR-FS few-shot learning benchmarks have low diversity. This novel insight contextualizes claims that transfer learning solutions are better than meta-learned solutions in the regime of low diversity under a fair comparison. Specifically, we empirically find that a low diversity coefficient correlates with a high similarity between transfer learning and MAML-learned solutions, both in accuracy at meta-test time and in classification-layer similarity (using feature-based distance metrics such as SVCCA, PWCCA, CKA, and OPD). To further support our claim, we find that this equivalence in meta-test accuracy holds even as the model size changes. Therefore, we conclude that in the low-diversity regime, MAML and transfer learning have equivalent meta-test performance when both are compared fairly. We also hope our work inspires more thoughtful constructions and quantitative evaluations of meta-learning benchmarks in the future.
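To make the diversity coefficient concrete, below is a minimal sketch of the idea the abstract describes: the expected distance between pairs of tasks sampled from a benchmark. The embedding step is an assumption for illustration (the paper derives task representations from a fixed probe network); `diversity_coefficient` and the random embeddings are hypothetical placeholders, not the authors' code.

```python
# Hedged sketch: diversity coefficient as the mean pairwise distance
# between task embeddings sampled from a few-shot benchmark.
# Assumption: tasks have already been embedded (e.g., by a fixed probe
# network, as in Task2Vec-style approaches); the embeddings here are fake.
import itertools
import numpy as np

def diversity_coefficient(task_embeddings: list[np.ndarray]) -> float:
    """Mean pairwise cosine distance over all pairs of task embeddings."""
    def cosine_distance(a: np.ndarray, b: np.ndarray) -> float:
        return 1.0 - float(a @ b) / (np.linalg.norm(a) * np.linalg.norm(b))

    pairs = itertools.combinations(range(len(task_embeddings)), 2)
    return float(np.mean([
        cosine_distance(task_embeddings[i], task_embeddings[j])
        for i, j in pairs
    ]))

# Hypothetical usage: embed a sample of few-shot tasks from a benchmark
# (e.g., MiniImageNet), then average pairwise distances. A low value means
# sampled tasks look alike, i.e., the benchmark has low task diversity.
rng = np.random.default_rng(0)
embeddings = [rng.normal(size=128) for _ in range(10)]  # placeholder embeddings
print(f"diversity coefficient: {diversity_coefficient(embeddings):.3f}")
```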
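The abstract also cites feature-based distance metrics (SVCCA, PWCCA, CKA, OPD) for comparing MAML and transfer-learned representations. As one example, here is a minimal sketch of linear CKA (Kornblith et al., 2019); the activation matrices and the comparison scenario are illustrative assumptions, not the paper's experimental setup.

```python
# Hedged sketch: linear CKA between two activation matrices, as one of the
# feature-based similarity metrics named in the abstract.
import numpy as np

def linear_cka(X: np.ndarray, Y: np.ndarray) -> float:
    """Linear CKA between activation matrices of shape (n_examples, n_features)."""
    # Center each feature column so CKA is invariant to mean shifts.
    X = X - X.mean(axis=0, keepdims=True)
    Y = Y - Y.mean(axis=0, keepdims=True)
    # Linear CKA = ||X^T Y||_F^2 / (||X^T X||_F * ||Y^T Y||_F)
    cross = np.linalg.norm(X.T @ Y, ord="fro") ** 2
    norm_x = np.linalg.norm(X.T @ X, ord="fro")
    norm_y = np.linalg.norm(Y.T @ Y, ord="fro")
    return float(cross / (norm_x * norm_y))

# Hypothetical usage: compare classification-layer activations of a
# MAML-trained and a transfer-learned model on the same meta-test batch.
# Values near 1.0 indicate highly similar learned representations.
rng = np.random.default_rng(0)
acts_maml = rng.normal(size=(256, 64))                        # fake activations
acts_transfer = acts_maml + 0.1 * rng.normal(size=(256, 64))  # nearly identical
print(f"linear CKA: {linear_cka(acts_maml, acts_transfer):.3f}")
```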