Paper Title
Robust Continual Learning through a Comprehensively Progressive Bayesian Neural Network
Paper Authors
Paper Abstract
This work proposes a comprehensively progressive Bayesian neural network for robust continual learning of a sequence of tasks. The Bayesian neural network is progressively pruned and grown so that there are sufficient network resources to represent the sequence of tasks without the network exploding in size. The approach starts with the contention that similar tasks should receive the same total amount of network resources, to ensure fair representation of all tasks in a continual learning scenario. Thus, as the data for a new task streams in, sufficient neurons are added to the network so that the total number of neurons in each layer, including the representations shared with previous tasks and the individual task-related representations, is equal for all tasks. Weights that are redundant at the end of training each task are pruned through re-initialization, so that they can be efficiently utilized in subsequent tasks. The network therefore grows progressively while ensuring effective utilization of network resources. We refer to our proposed method as 'Robust Continual Learning through a Comprehensively Progressive Bayesian Neural Network (RCL-CPB)' and evaluate it on the MNIST data set under three different continual learning scenarios. In addition, we evaluate the performance of RCL-CPB on a homogeneous sequence of tasks using Split CIFAR100 (20 tasks of 5 classes each), and on a heterogeneous sequence of tasks using the MNIST, SVHN and CIFAR10 data sets. The demonstrations and performance results show that the proposed strategies for a progressive BNN enable robust continual learning.
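The growth-and-prune rule described above can be illustrated with a minimal sketch. This is not the authors' implementation: the per-task layer budget `target_width`, the count of reused neurons `shared_with_previous`, and the magnitude-based pruning criterion are all assumptions introduced here for illustration only.

```python
# Illustrative sketch of the abstract's growth rule: every task should end up
# with the same per-layer budget, composed of neurons shared with earlier
# tasks plus newly added task-specific neurons. All names and the pruning
# criterion are assumptions, not the paper's actual algorithm.

def neurons_to_add(target_width: int, shared_with_previous: int) -> int:
    """Number of new neurons to grow so that (shared + new) equals the
    per-task budget `target_width` that every earlier task also received."""
    assert shared_with_previous <= target_width
    return target_width - shared_with_previous


def prune_by_reinitialization(weights, threshold=1e-3):
    """Sketch of pruning through re-initialization: weights that are
    near-zero (assumed redundant) after training a task are reset to zero
    so they are free capacity for the next task."""
    return [w if abs(w) >= threshold else 0.0 for w in weights]


# Example: task 1 used 100 neurons per layer; task 2 can reuse 60 of them,
# so 40 new neurons are grown to keep the per-task total at 100.
print(neurons_to_add(target_width=100, shared_with_previous=60))  # -> 40
print(prune_by_reinitialization([0.4, -2e-4, 1.2, 5e-5]))  # -> [0.4, 0.0, 1.2, 0.0]
```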