Title
An Experimental Study of Weight Initialization and Weight Inheritance Effects on Neuroevolution
Authors
Abstract
Weight initialization is critical to successfully training artificial neural networks (ANNs), and even more so for recurrent neural networks (RNNs), which can easily suffer from vanishing and exploding gradients. In neuroevolution, where evolutionary algorithms are applied to neural architecture search, weights typically need to be initialized at three different times: when initial genomes (ANN architectures) are created at the beginning of the search, when offspring genomes are generated by crossover, and when new nodes or edges are created during mutation. This work explores the differences between the Xavier, Kaiming, and uniform random weight initialization methods, as well as novel Lamarckian weight inheritance methods, for initializing new weights during crossover and mutation operations. These are examined using the Evolutionary eXploration of Augmenting Memory Models (EXAMM) neuroevolution algorithm, which is capable of evolving RNNs with a variety of modern memory cells (e.g., LSTM, GRU, MGU, UGRNN, and Delta-RNN cells) as well as recurrent connections with varying time skips, through a high-performance, island-based distributed evolutionary algorithm. Results show that, with statistical significance, the Lamarckian strategies outperform Kaiming, Xavier, and uniform random weight initialization, and can speed up neuroevolution by requiring fewer backpropagation epochs to be evaluated for each generated RNN.
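To make the compared strategies concrete, the sketch below shows the standard Xavier (Glorot) and Kaiming (He) initialization formulas, plain uniform random initialization, and a deliberately simplified stand-in for Lamarckian weight inheritance. The function names and the convex-combination blend in `lamarckian_crossover` are illustrative assumptions, not EXAMM's actual operators; the key idea it illustrates is that offspring reuse their parents' trained weights instead of re-randomizing them.

```python
import math
import random

def xavier_uniform(fan_in, fan_out, rng):
    # Xavier/Glorot uniform: U(-a, a) with a = sqrt(6 / (fan_in + fan_out))
    a = math.sqrt(6.0 / (fan_in + fan_out))
    return [[rng.uniform(-a, a) for _ in range(fan_in)] for _ in range(fan_out)]

def kaiming_normal(fan_in, fan_out, rng):
    # Kaiming/He normal: N(0, sqrt(2 / fan_in)), derived for ReLU-like units
    std = math.sqrt(2.0 / fan_in)
    return [[rng.gauss(0.0, std) for _ in range(fan_in)] for _ in range(fan_out)]

def uniform_random(fan_in, fan_out, rng, bound=0.5):
    # Plain uniform random initialization in [-bound, bound]
    return [[rng.uniform(-bound, bound) for _ in range(fan_in)]
            for _ in range(fan_out)]

def lamarckian_crossover(w_parent1, w_parent2, rng):
    # Hypothetical sketch of Lamarckian inheritance during crossover:
    # each child weight is a random convex combination of the two
    # parents' *trained* weights, so learned values carry over.
    child = []
    for row1, row2 in zip(w_parent1, w_parent2):
        child_row = []
        for a, b in zip(row1, row2):
            r = rng.random()
            child_row.append(r * a + (1.0 - r) * b)
        child.append(child_row)
    return child
```

In this sketch, newly mutated nodes or edges would fall back to one of the random initializers, while crossover offspring inherit blended parent weights, which is what lets backpropagation start from an already-trained region of weight space.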