Paper Title
The Cost of Learning: Efficiency vs. Efficacy of Learning-Based RRM for 6G
Paper Authors
Paper Abstract
In the past few years, Deep Reinforcement Learning (DRL) has become a valuable solution to automatically learn efficient resource management strategies in complex networks. In many scenarios, the learning task is performed in the Cloud, while experience samples are generated directly by edge nodes or users. Therefore, the learning task involves some data exchange which, in turn, subtracts a certain amount of transmission resources from the system. This creates a tension between the need to speed up convergence towards an effective strategy, which requires allocating resources to transmit learning samples, and the need to maximize the amount of resources used for data plane communication, and thus users' Quality of Service (QoS), which requires the learning process to be efficient, i.e., to minimize its overhead. In this paper, we investigate this trade-off and propose a dynamic balancing strategy between the learning and data planes, which allows the centralized learning agent to quickly converge to an efficient resource allocation strategy while minimizing the impact on QoS. Simulation results show that the proposed method outperforms static allocation methods, converging to the optimal policy (i.e., maximum efficacy and minimum overhead of the learning plane) in the long run.
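To make the learning-plane/data-plane trade-off concrete, the sketch below models it in Python. It is only an illustration of the idea described in the abstract, not the paper's actual algorithm: the budget of 100 resource blocks, the `update_share` rule, and all thresholds and step sizes are assumptions introduced here for exposition. The intent is simply to show a dynamic split that spends transmission resources on uploading experience samples while the DRL policy is still improving, and reclaims them for user traffic (QoS) once it has converged.

```python
# Illustrative sketch only: the abstract does not specify this exact rule.
# Each slot, a fixed budget of transmission resources is split between the
# "learning plane" (uploading experience samples to the cloud DRL agent)
# and the "data plane" (user traffic / QoS).

def split_resources(total_rb: int, learning_share: float) -> tuple[int, int]:
    """Split a budget of resource blocks between learning and data planes."""
    learning_rb = int(round(total_rb * learning_share))
    return learning_rb, total_rb - learning_rb

def update_share(share: float, reward_gain: float,
                 min_share: float = 0.0, max_share: float = 0.5,
                 step: float = 0.02) -> float:
    """Hypothetical dynamic balancing rule: spend more on sample uploads
    while the learned policy is still improving, and shrink the
    learning-plane overhead once rewards plateau."""
    if reward_gain > 0.01:   # policy still improving -> keep feeding samples
        share += step
    else:                    # converged -> reclaim resources for QoS
        share -= step
    return max(min_share, min(max_share, share))

# Toy usage: the learning-plane overhead decays once learning has converged.
share, prev_reward = 0.3, 0.0
for slot in range(5):
    reward = min(1.0, prev_reward + 0.4)   # stand-in for DRL episode returns
    learn_rb, data_rb = split_resources(100, share)
    print(f"slot {slot}: learning RBs={learn_rb}, data RBs={data_rb}")
    share, prev_reward = update_share(share, reward - prev_reward), reward
```

A static allocation would keep `learning_share` fixed, paying the sample-upload overhead forever; the dynamic rule instead drives that overhead toward zero in the long run, which is the behavior the abstract attributes to the proposed method.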