Paper Title

Deep Reinforcement Learning for Distributed and Uncoordinated Cognitive Radios Resource Allocation

Authors

Ankita Tondwalkar, Andres Kwasinski

Abstract


This paper presents a novel deep reinforcement learning-based resource allocation technique for the multi-agent environment presented by a cognitive radio network, where the interactions of the agents during learning may lead to a non-stationary environment. The resource allocation technique presented in this work is distributed, requiring no coordination with other agents. By considering aspects specific to deep reinforcement learning, it is shown that the presented algorithm converges in an arbitrarily long time to equilibrium policies in the non-stationary multi-agent environment that results from the uncoordinated dynamic interaction between radios through the shared wireless environment. Simulation results show that the presented technique achieves faster learning performance than an equivalent table-based Q-learning algorithm and is able to find the optimal policy in 99% of cases given a sufficiently long learning time. In addition, simulations show that the DQL approach requires less than half the number of learning steps to achieve the same performance as an equivalent table-based implementation. Moreover, it is shown that a standard single-agent deep reinforcement learning approach may fail to converge when used in an uncoordinated interacting multi-radio scenario.
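For readers unfamiliar with the table-based baseline the abstract compares against, the sketch below shows a minimal tabular Q-learning update. This is illustrative only and is not the paper's algorithm: the states, actions, reward values, and hyperparameters (`alpha`, `gamma`) here are hypothetical placeholders, not taken from the paper.

```python
# Illustrative sketch of the tabular Q-learning baseline mentioned in the
# abstract. All names and values below are hypothetical, not from the paper.

def q_learning_step(q, state, action, reward, next_state, actions,
                    alpha=0.1, gamma=0.9):
    """One temporal-difference update:
    Q(s,a) <- Q(s,a) + alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))."""
    best_next = max(q.get((next_state, a), 0.0) for a in actions)
    old = q.get((state, action), 0.0)
    q[(state, action)] = old + alpha * (reward + gamma * best_next - old)
    return q[(state, action)]

# Toy usage: a radio choosing between two channels (hypothetical scenario).
actions = ["ch0", "ch1"]
q = {}  # the Q-table, keyed by (state, action)
q_learning_step(q, "idle", "ch0", reward=1.0, next_state="idle",
                actions=actions)
```

In the deep Q-learning (DQL) variant the paper studies, this explicit table is replaced by a neural network approximating Q(s, a), which is what enables the faster learning reported in the abstract.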
