Paper Title
Towards Practical Few-Shot Query Sets: Transductive Minimum Description Length Inference
Paper Authors
Paper Abstract
Standard few-shot benchmarks are often built upon simplifying assumptions on the query sets, which may not always hold in practice. In particular, for each task at testing time, the classes effectively present in the unlabeled query set are known a priori, and correspond exactly to the set of classes represented in the labeled support set. We relax these assumptions and extend current benchmarks, so that the query-set classes of a given task are unknown, but just belong to a much larger set of possible classes. Our setting could be viewed as an instance of the challenging yet practical problem of extremely imbalanced K-way classification, with K much larger than the values typically used in standard benchmarks, and with potentially irrelevant supervision from the support set. Expectedly, our setting incurs drops in the performance of state-of-the-art methods. Motivated by these observations, we introduce a PrimAl Dual Minimum Description LEngth (PADDLE) formulation, which balances data-fitting accuracy and model complexity for a given few-shot task, under supervision constraints from the support set. Our constrained MDL-like objective promotes competition among a large set of possible classes, preserving only the effective classes that best fit the data of the few-shot task. It is hyperparameter-free, and can be applied on top of any base-class training. Furthermore, we derive a fast block coordinate descent algorithm for optimizing our objective, with convergence guarantees and linear computational complexity at each iteration. Comprehensive experiments over the standard few-shot datasets and the more realistic and challenging i-Nat dataset show the highly competitive performance of our method, all the more so as the number of possible classes in the tasks increases. Our code is publicly available at https://github.com/SegoleneMartin/PADDLE.
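To make the idea concrete, here is a minimal, self-contained sketch of the generic recipe the abstract describes: a data-fitting term (squared distances of query features to class prototypes) balanced against a concave class-usage penalty, so that among the K candidate classes only those supported by the task data retain assignment mass, with the two blocks updated alternately by coordinate descent. The function name `paddle_like_inference`, the `lam`-weighted square-root marginal penalty, and the entropy-style softmax assignment step are illustrative assumptions of this sketch, not the authors' exact PADDLE objective or primal-dual algorithm; see the repository above for the actual implementation.

```python
import numpy as np

def paddle_like_inference(z_s, y_s, z_q, n_classes, lam=1.0, n_iter=20):
    """Toy transductive inference in the spirit of an MDL-style objective:
    data fit (squared distance to class prototypes) plus a concave
    class-usage penalty that progressively empties unused classes.
    Illustrative sketch only, NOT the authors' PADDLE algorithm.

    z_s : (n_s, d) support features;  y_s : (n_s,) labels in [0, n_classes)
    z_q : (n_q, d) query features
    Returns soft query assignments of shape (n_q, n_classes).
    """
    n_q = z_q.shape[0]
    # One-hot support assignments are fixed by the supervision constraints.
    u_s = np.eye(n_classes)[y_s]                      # (n_s, K)
    u_q = np.full((n_q, n_classes), 1.0 / n_classes)  # uniform init

    for _ in range(n_iter):
        # --- Block 1: prototypes as assignment-weighted means (closed form).
        w = np.concatenate([u_s, u_q], axis=0)        # (n_s + n_q, K)
        z = np.concatenate([z_s, z_q], axis=0)        # (n_s + n_q, d)
        mass = w.sum(axis=0, keepdims=True) + 1e-12
        protos = (w.T @ z) / mass.T                   # (K, d)

        # --- Block 2: query assignments. The concave penalty lam * sqrt(m_k)
        # on the class marginals m_k is majorized by its tangent, so each
        # class pays a per-point surcharge lam / (2 sqrt(m_k)) (constants
        # absorbed into lam); low-mass classes become expensive and empty out.
        m = u_q.mean(axis=0) + 1e-12                  # current class marginals
        surcharge = lam / (2.0 * np.sqrt(m))          # (K,)
        d2 = ((z_q[:, None, :] - protos[None, :, :]) ** 2).sum(-1)  # (n_q, K)
        logits = -(d2 + surcharge[None, :])
        logits -= logits.max(axis=1, keepdims=True)   # numerical stability
        u_q = np.exp(logits)
        u_q /= u_q.sum(axis=1, keepdims=True)

    return u_q
```

With `lam=0` the sketch reduces to soft k-means seeded by the support set; the penalty is what creates competition among the K candidate classes, mirroring the class-count competition the abstract describes (the paper's own formulation is hyperparameter-free, whereas this toy exposes `lam` as a knob). Each iteration costs O((n_s + n_q) · K · d), i.e., linear in the task size, consistent with the per-iteration complexity claimed above.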