Paper Title
Human-Level Control through Directly-Trained Deep Spiking Q-Networks
Paper Authors
Paper Abstract
As the third generation of neural networks, Spiking Neural Networks (SNNs) have great potential on neuromorphic hardware because of their high energy efficiency. However, Deep Spiking Reinforcement Learning (DSRL), i.e., Reinforcement Learning (RL) based on SNNs, is still in its preliminary stage due to the binary output and the non-differentiability of the spiking function. To address these issues, we propose the Deep Spiking Q-Network (DSQN) in this paper. Specifically, we propose a directly-trained deep spiking reinforcement learning architecture based on Leaky Integrate-and-Fire (LIF) neurons and the Deep Q-Network (DQN). Then, we adapt a direct spiking learning algorithm to the Deep Spiking Q-Network. We further demonstrate theoretically the advantages of using LIF neurons in DSQN. Comprehensive experiments have been conducted on 17 top-performing Atari games to compare our method with the state-of-the-art conversion method. The experimental results demonstrate the superiority of our method in terms of performance, stability, robustness, and energy efficiency. To the best of our knowledge, our work is the first to achieve state-of-the-art performance on multiple Atari games with a directly-trained SNN.
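The abstract's references to LIF neurons and their binary spike output can be illustrated with a minimal sketch. This is not the authors' implementation; the leak constant, threshold, and reset value below are illustrative assumptions.

```python
# Minimal sketch of Leaky Integrate-and-Fire (LIF) dynamics: the
# membrane potential leaks toward its rest value, integrates input
# current, and emits a binary spike (the "binary output" mentioned in
# the abstract) when it crosses a threshold, after which it is reset.
# Parameters tau, v_threshold, v_reset are illustrative, not the paper's.

def lif_forward(inputs, tau=2.0, v_threshold=1.0, v_reset=0.0):
    """Simulate one LIF neuron over a sequence of input currents.

    Returns the list of binary spikes, one per timestep.
    """
    v = v_reset
    spikes = []
    for x in inputs:
        # Leaky integration: decay toward v_reset, then add the input.
        v = v + (x - (v - v_reset)) / tau
        if v >= v_threshold:
            spikes.append(1)
            v = v_reset          # hard reset after firing
        else:
            spikes.append(0)
    return spikes

# A suprathreshold constant input produces periodic spikes,
# while a subthreshold input never fires.
print(lif_forward([1.5] * 6))  # → [0, 1, 0, 1, 0, 1]
print(lif_forward([0.8] * 6))  # → [0, 0, 0, 0, 0, 0]
```

Because the spike is a hard threshold (non-differentiable), direct training as described in the abstract typically relies on a surrogate gradient in the backward pass.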