Structure-Enhanced DRL for Optimal Transmission Scheduling

Paper Title

Structure-Enhanced DRL for Optimal Transmission Scheduling

Authors

Jiazheng Chen, Wanchun Liu, Daniel E. Quevedo, Saeed R. Khosravirad, Yonghui Li, Branka Vucetic

Abstract

Remote state estimation of large-scale distributed dynamic processes plays an important role in Industry 4.0 applications. In this paper, we focus on the transmission scheduling problem of a remote estimation system. First, we derive some structural properties of the optimal sensor scheduling policy over fading channels. Then, building on these theoretical guidelines, we develop a structure-enhanced deep reinforcement learning (DRL) framework for optimal scheduling of the system to achieve the minimum overall estimation mean-square error (MSE). In particular, we propose a structure-enhanced action selection method, which tends to select actions that obey the policy structure. This explores the action space more effectively and enhances the learning efficiency of DRL agents. Furthermore, we introduce a structure-enhanced loss function that penalizes actions that do not follow the policy structure. The new loss function guides the DRL agent to converge quickly to the optimal policy structure. Our numerical experiments illustrate that the proposed structure-enhanced DRL algorithms can cut training time by 50% and reduce the remote estimation MSE by 10% to 25% compared with benchmark DRL algorithms. In addition, we show that the derived structural properties exist in a wide range of dynamic scheduling problems that go beyond remote state estimation.
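The two ideas named in the abstract, structure-biased action selection and a structure penalty in the loss, can be illustrated with a minimal Python sketch. This is not the authors' algorithm: the abstract does not state the structural properties or the form of the penalty. The predicate `obeys_structure`, the hinge-style penalty, and the names `select_action`, `structure_penalized_loss`, and `weight` are all assumptions made for illustration.

```python
import random
from typing import Callable, Sequence


def select_action(q_values: Sequence[float],
                  obeys_structure: Callable[[int], bool],
                  epsilon: float,
                  rng: random.Random) -> int:
    """Epsilon-greedy selection restricted to structure-compliant actions.

    `obeys_structure` is a hypothetical predicate standing in for the
    structural properties derived in the paper.
    """
    actions = list(range(len(q_values)))
    compliant = [a for a in actions if obeys_structure(a)]
    # Fall back to the full action set if no action complies.
    candidates = compliant if compliant else actions
    if rng.random() < epsilon:
        return rng.choice(candidates)          # explore among candidates
    return max(candidates, key=lambda a: q_values[a])  # exploit


def structure_penalized_loss(td_error: float,
                             q_values: Sequence[float],
                             obeys_structure: Callable[[int], bool],
                             weight: float) -> float:
    """Squared TD error plus an assumed hinge penalty.

    The penalty is positive only when some structure-violating action has a
    higher Q-value than every compliant action.
    """
    compliant = [q for a, q in enumerate(q_values) if obeys_structure(a)]
    violating = [q for a, q in enumerate(q_values) if not obeys_structure(a)]
    penalty = 0.0
    if compliant and violating:
        penalty = max(0.0, max(violating) - max(compliant))
    return td_error ** 2 + weight * penalty
```

In this sketch, exploration never samples a violating action unless no compliant action exists, and the penalty vanishes once the greedy action is compliant. A real implementation would apply the same penalty to batched network outputs inside a DRL training loop.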
