Paper Title

TKIL: Tangent Kernel Approach for Class Balanced Incremental Learning

Paper Authors

Jinlin Xiang, Eli Shlizerman

Paper Abstract

When learning new tasks in a sequential manner, deep neural networks tend to forget tasks that they previously learned, a phenomenon called catastrophic forgetting. Class incremental learning methods aim to address this problem by keeping a memory of a few exemplars from previously learned tasks, and distilling knowledge from them. However, existing methods struggle to balance the performance across classes since they typically overfit the model to the latest task. In our work, we propose to address these challenges by introducing a novel methodology, Tangent Kernel for Incremental Learning (TKIL), that achieves class-balanced performance. The approach preserves the representations across classes and balances the accuracy for each class, and as such achieves better overall accuracy and variance. The TKIL approach is based on the Neural Tangent Kernel (NTK), which describes the convergence behavior of neural networks as a kernel function in the limit of infinite width. In TKIL, the gradients between feature layers are treated as the distance between the representations of these layers and define a Gradients Tangent Kernel loss (GTK loss), which is minimized along with averaging of the weights. This allows TKIL to automatically identify the task and to quickly adapt to it during inference. Experiments on the CIFAR-100 and ImageNet datasets with various incremental learning settings show that these strategies allow TKIL to outperform existing state-of-the-art methods.
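
The abstract does not spell out the exact form of the GTK loss, so the following is only a minimal sketch of the underlying idea, assuming the empirical tangent kernel of a model on a batch is the Gram matrix of per-sample parameter gradients and that the loss penalizes the squared difference between the current model's kernel and that of a stored previous-task model. The function names (empirical_ntk, gtk_style_loss), the toy network, and the slow per-sample gradient loop are illustrative assumptions, not the authors' implementation.

```python
# Illustrative sketch of a tangent-kernel-style distillation loss (not the TKIL code).
import copy

import torch
import torch.nn as nn


def empirical_ntk(model: nn.Module, x: torch.Tensor, create_graph: bool = False) -> torch.Tensor:
    """Empirical tangent kernel: Gram matrix of per-sample parameter gradients."""
    params = [p for p in model.parameters() if p.requires_grad]
    rows = []
    for xi in x:
        out = model(xi.unsqueeze(0)).sum()  # reduce each sample's output to a scalar
        grads = torch.autograd.grad(out, params, create_graph=create_graph)
        rows.append(torch.cat([g.reshape(-1) for g in grads]))
    jac = torch.stack(rows)                 # (batch, n_params) Jacobian
    return jac @ jac.t()                    # (batch, batch) kernel matrix


def gtk_style_loss(current: nn.Module, stored: nn.Module, x: torch.Tensor) -> torch.Tensor:
    """Squared distance between the current and stored models' tangent kernels."""
    k_cur = empirical_ntk(current, x, create_graph=True)  # differentiable w.r.t. `current`
    k_old = empirical_ntk(stored, x).detach()             # fixed target from the stored model
    return torch.mean((k_cur - k_old) ** 2)


if __name__ == "__main__":
    net = nn.Sequential(nn.Linear(8, 32), nn.ReLU(), nn.Linear(32, 1))
    old = copy.deepcopy(net)                # stands in for a stored previous-task model
    with torch.no_grad():                   # simulate drift after training on a new task
        for p in net.parameters():
            p.add_(0.01 * torch.randn_like(p))
    x = torch.randn(4, 8)                   # a small batch of exemplars
    loss = gtk_style_loss(net, old, x)
    loss.backward()                         # gradients flow only into `net`
    print(float(loss))
```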
