Efficient Adaptive Ensembling for Image Classification

Paper title

Efficient Adaptive Ensembling for Image Classification

Authors

Antonio Bruno, Davide Moroni, Massimo Martinelli

Abstract

In recent times, with the exception of sporadic cases, the trend in Computer Vision has been to achieve minor improvements at the cost of considerable increases in complexity. To reverse this trend, we propose a novel method to boost image classification performance without increasing complexity. To this end, we revisited ensembling, a powerful approach that is often not used properly because of its more complex nature and its training time, and made it feasible through a specific design choice. First, we trained two EfficientNet-b0 end-to-end models (known to be the architecture with the best overall accuracy/complexity trade-off for image classification) on disjoint subsets of data (i.e. bagging). Then, we made an efficient adaptive ensemble by fine-tuning a trainable combination layer. In this way, we were able to outperform the state of the art by an average of 0.5% in accuracy, with restrained complexity both in terms of the number of parameters (by 5-60 times) and the FLoating point Operations Per Second (FLOPS) (by 10-100 times) on several major benchmark datasets.
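
The two-stage recipe in the abstract (base models trained on disjoint subsets, then a trainable combination layer fine-tuned on top of them) can be illustrated with a toy sketch. This is not the authors' implementation: the abstract does not say what form the combination layer takes, so a single linear layer over the concatenated base-model outputs is assumed here. Nearest-centroid classifiers on synthetic 2-D points stand in for the EfficientNet-b0 models and the image data. All names (`fit_centroids`, `CombinationLayer`, `run_demo`) are hypothetical.

```python
import math
import random


def softmax(z):
    """Numerically stable softmax over a list of floats."""
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


def make_data(n_per_class, rng):
    """Toy 2-D, 3-class data standing in for an image dataset (assumption)."""
    centers = [(0.0, 0.0), (3.0, 0.0), (0.0, 3.0)]
    data = [((rng.gauss(cx, 1.0), rng.gauss(cy, 1.0)), label)
            for label, (cx, cy) in enumerate(centers)
            for _ in range(n_per_class)]
    rng.shuffle(data)
    return data


def fit_centroids(samples, n_classes):
    """'Train' a stand-in base model: one centroid per class."""
    sums = [[0.0, 0.0, 0] for _ in range(n_classes)]
    for (x, y), label in samples:
        sums[label][0] += x
        sums[label][1] += y
        sums[label][2] += 1
    return [(sx / max(n, 1), sy / max(n, 1)) for sx, sy, n in sums]


def base_probs(centroids, point):
    """Frozen base-model output: softmax of negative squared distances."""
    return softmax([-((point[0] - cx) ** 2 + (point[1] - cy) ** 2)
                    for cx, cy in centroids])


class CombinationLayer:
    """Assumed form: one linear layer over concatenated base-model outputs."""

    def __init__(self, n_models, n_classes):
        dim = n_models * n_classes
        # Initialise as a plain average of the base models' probabilities.
        self.W = [[1.0 / n_models if j % n_classes == c else 0.0
                   for j in range(dim)] for c in range(n_classes)]
        self.b = [0.0] * n_classes

    def forward(self, feats):
        return [sum(w * x for w, x in zip(row, feats)) + b
                for row, b in zip(self.W, self.b)]

    def sgd_step(self, feats, label, lr):
        """One SGD step on softmax cross-entropy; returns the pre-update loss."""
        probs = softmax(self.forward(feats))
        for c, p in enumerate(probs):
            grad = p - (1.0 if c == label else 0.0)
            self.b[c] -= lr * grad
            for j, x in enumerate(feats):
                self.W[c][j] -= lr * grad * x
        return -math.log(probs[label])


def run_demo(seed=0, epochs=20, lr=0.1):
    rng = random.Random(seed)
    train = make_data(60, rng)
    half = len(train) // 2
    # Stage 1: each base model sees its own disjoint half of the data.
    models = [fit_centroids(part, 3) for part in (train[:half], train[half:])]

    def features(point):
        return [p for m in models for p in base_probs(m, point)]

    # Stage 2: base models stay frozen; only the combination layer is trained.
    layer = CombinationLayer(n_models=2, n_classes=3)
    losses = []
    for _ in range(epochs):
        total = sum(layer.sgd_step(features(pt), y, lr) for pt, y in train)
        losses.append(total / len(train))
    return layer, losses


if __name__ == "__main__":
    _, losses = run_demo()
    print("mean loss, first vs last epoch:", losses[0], losses[-1])
```

Because only the small combination layer receives gradient updates, the second stage is cheap next to training the base models. That is one plausible reading of how the ensemble stays "efficient" in the abstract's sense.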

Paper title