Paper Title
Human-Level Control through Directly-Trained Deep Spiking Q-Networks
Paper Authors
Paper Abstract
As the third generation of neural networks, Spiking Neural Networks (SNNs) have great potential on neuromorphic hardware because of their high energy efficiency. However, Deep Spiking Reinforcement Learning (DSRL), i.e., Reinforcement Learning (RL) based on SNNs, is still in its preliminary stage due to the binary output and the non-differentiability of the spiking function. To address these issues, we propose the Deep Spiking Q-Network (DSQN) in this paper. Specifically, we propose a directly-trained deep spiking reinforcement learning architecture based on Leaky Integrate-and-Fire (LIF) neurons and the Deep Q-Network (DQN). Then, we adapt a direct spiking learning algorithm to the Deep Spiking Q-Network. We further demonstrate theoretically the advantages of using LIF neurons in DSQN. Comprehensive experiments have been conducted on 17 top-performing Atari games to compare our method with the state-of-the-art conversion method. The experimental results demonstrate the superiority of our method in terms of performance, stability, robustness, and energy efficiency. To the best of our knowledge, our work is the first to achieve state-of-the-art performance on multiple Atari games with a directly-trained SNN.
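The abstract's references to LIF neurons and their binary spike output can be illustrated with a minimal sketch. This is not the authors' implementation; the leak constant, threshold, and reset value below are illustrative assumptions.

```python
# Minimal sketch of Leaky Integrate-and-Fire (LIF) dynamics: the
# membrane potential leaks toward its rest value, integrates input
# current, and emits a binary spike (the "binary output" mentioned in
# the abstract) when it crosses a threshold, after which it is reset.
# Parameters tau, v_threshold, v_reset are illustrative, not the paper's.

def lif_forward(inputs, tau=2.0, v_threshold=1.0, v_reset=0.0):
    """Simulate one LIF neuron over a sequence of input currents.

    Returns the list of binary spikes, one per timestep.
    """
    v = v_reset
    spikes = []
    for x in inputs:
        # Leaky integration: decay toward v_reset, then add the input.
        v = v + (x - (v - v_reset)) / tau
        if v >= v_threshold:
            spikes.append(1)
            v = v_reset          # hard reset after firing
        else:
            spikes.append(0)
    return spikes

# A suprathreshold constant input produces periodic spikes,
# while a subthreshold input never fires.
print(lif_forward([1.5] * 6))  # → [0, 1, 0, 1, 0, 1]
print(lif_forward([0.8] * 6))  # → [0, 0, 0, 0, 0, 0]
```

Because the spike is a hard threshold (non-differentiable), direct training as described in the abstract typically relies on a surrogate gradient in the backward pass.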