Paper title
Implicit recurrent networks: A novel approach to stationary input processing with recurrent neural networks in deep learning
Paper authors
Paper abstract
The cerebral cortex, which processes visual, auditory and sensory data, is known to have many recurrent connections within its layers and from higher to lower layers. In machine learning with neural networks, however, it is generally assumed that strict feed-forward architectures are suitable for static input data, such as images, whereas recurrent networks are mainly required for processing sequential input, such as language. However, it is not clear whether the processing of static input data also benefits from recurrent connectivity. In this work, we introduce and test a novel implementation of recurrent neural networks with lateral and feed-back connections in deep learning. This departure from the strict feed-forward structure prevents the use of the standard error backpropagation algorithm for training the networks. We therefore provide an algorithm that implements the backpropagation algorithm on an implicit implementation of recurrent networks, which differs from state-of-the-art implementations of recurrent neural networks. In contrast to current recurrent neural networks, our method eliminates the long chains of derivatives arising from many iterative update steps, which makes learning computationally less costly. It turns out that the presence of recurrent intra-layer connections within a one-layer implicit recurrent network considerably enhances performance: a single-layer implicit recurrent network is able to solve the XOR problem, while a feed-forward network with a monotonically increasing activation function fails at this task. Finally, we demonstrate that a two-layer implicit recurrent architecture leads to better performance in a regression task that infers physical parameters from the measured trajectory of a damped pendulum.
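The abstract does not give implementation details, but an "implicit implementation" of a recurrent layer for a stationary input is commonly formulated as a fixed-point equation whose gradients are obtained via the implicit function theorem. The NumPy sketch below illustrates that general idea under this assumption; the names (`forward`, `implicit_grads`, `W_rec`, `W_in`) and the specific update rule are illustrative, and this is not claimed to be the authors' exact algorithm.

```python
# Minimal sketch (assumption, not the paper's algorithm): for a stationary input x,
# the hidden state h is the fixed point of
#     h = tanh(W_rec @ h + W_in @ x + b),
# where W_rec holds the lateral (intra-layer) recurrent connections.  Gradients
# come from the implicit function theorem, so no chain of per-iteration
# derivatives needs to be stored.
import numpy as np

rng = np.random.default_rng(0)
n_in, n_hid = 2, 4

# Small recurrent weights so the fixed-point iteration is (assumed) contractive.
W_in = rng.normal(scale=0.5, size=(n_hid, n_in))
W_rec = rng.normal(scale=0.3, size=(n_hid, n_hid))
b = np.zeros(n_hid)
w_out = rng.normal(scale=0.5, size=n_hid)  # linear readout


def forward(x, n_iter=50):
    """Solve h = tanh(W_rec h + W_in x + b) by fixed-point iteration."""
    h = np.zeros(n_hid)
    for _ in range(n_iter):
        h = np.tanh(W_rec @ h + W_in @ x + b)
    return h


def implicit_grads(x, h, dloss_dy):
    """Backward pass via the implicit function theorem.

    Differentiating the fixed-point condition h = tanh(z), z = W_rec h + W_in x + b,
    gives dh = (I - D W_rec)^{-1} D dz_direct with D = diag(1 - h**2).
    A single linear solve replaces backprop through all update steps.
    """
    D = np.diag(1.0 - h**2)                      # tanh'(z)
    dloss_dh = dloss_dy * w_out                  # gradient from the linear readout
    # Adjoint system: (I - W_rec^T D) a = dL/dh
    a = np.linalg.solve(np.eye(n_hid) - W_rec.T @ D, dloss_dh)
    dz = D @ a                                   # gradient w.r.t. pre-activation z
    return np.outer(dz, x), np.outer(dz, h), dz  # grads for W_in, W_rec, b


# Toy usage: one gradient evaluation on a single XOR-style example.
x, target = np.array([1.0, 0.0]), 1.0
h = forward(x)
y = w_out @ h
dloss_dy = 2.0 * (y - target)                    # d/dy of squared error
gW_in, gW_rec, gb = implicit_grads(x, h, dloss_dy)
```

In this reading, one linear solve of size `n_hid` replaces backpropagation through every iterative update step, which is one plausible sense in which the long chains of derivatives mentioned in the abstract are eliminated.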