Paper Title

Multi-Source AoI-Constrained Resource Minimization under HARQ: Heterogeneous Sampling Processes

Authors

Saeid Sadeghi Vilni, Mohammad Moltafet, Markus Leinonen, Marian Codreanu

Abstract

We consider a multi-source hybrid automatic repeat request (HARQ) based system, where a transmitter sends status update packets from random-arrival (i.e., uncontrollable sampling) and generate-at-will (i.e., controllable sampling) sources to a destination through an error-prone channel. We develop transmission scheduling policies that minimize the average number of transmissions subject to an average age of information (AoI) constraint. First, we consider a known environment (i.e., known system statistics) and develop a near-optimal deterministic transmission policy and a low-complexity dynamic transmission (LC-DT) policy. The former is derived by casting the main problem as a constrained Markov decision process (CMDP) problem, which is then solved using Lagrangian relaxation, the relative value iteration algorithm, and bisection. The LC-DT policy is developed via the drift-plus-penalty (DPP) method by transforming the main problem into a sequence of per-slot problems. Finally, we consider an unknown environment and devise a learning-based transmission policy by relaxing the CMDP problem into an MDP problem using the DPP method and then adopting the deep Q-learning algorithm. Numerical results show that the proposed policies achieve near-optimal performance and illustrate the benefits of HARQ in status updating.
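
To give a feel for how a drift-plus-penalty rule reduces an average-AoI-constrained problem to per-slot decisions, here is a minimal Python sketch. It is not the paper's LC-DT policy: it assumes a single generate-at-will source, a fixed success probability per transmission (no HARQ combining gain), and no random-arrival source. The names dpp_action, simulate, p_success, aoi_max, and v are hypothetical and chosen only for illustration.

```python
import random


def dpp_action(aoi, queue, p_success, v):
    """Per-slot drift-plus-penalty decision (illustrative assumption, not the paper's rule).

    Compares V * (transmission cost) + queue * E[next AoI] for the two actions
    and returns 1 (transmit) or 0 (stay idle).
    """
    # Idle: the AoI simply grows by one slot.
    cost_idle = queue * (aoi + 1)
    # Transmit: pay penalty v; AoI resets to 1 on success, grows otherwise.
    expected_next_aoi = p_success * 1 + (1 - p_success) * (aoi + 1)
    cost_tx = v + queue * expected_next_aoi
    return 1 if cost_tx < cost_idle else 0


def simulate(num_slots, p_success, aoi_max, v, seed=0):
    """Run the rule and return (average AoI, average transmissions per slot)."""
    rng = random.Random(seed)
    aoi, queue = 1, 0.0
    total_aoi, total_tx = 0, 0
    for _ in range(num_slots):
        action = dpp_action(aoi, queue, p_success, v)
        total_tx += action
        if action == 1 and rng.random() < p_success:
            aoi = 1          # successful delivery of a fresh sample
        else:
            aoi += 1         # no delivery this slot
        # Virtual queue tracks accumulated violation of the average-AoI constraint.
        queue = max(queue + aoi - aoi_max, 0.0)
        total_aoi += aoi
    return total_aoi / num_slots, total_tx / num_slots


if __name__ == "__main__":
    avg_aoi, avg_tx = simulate(num_slots=10000, p_success=0.8, aoi_max=3.0, v=2.0)
    print(f"average AoI: {avg_aoi:.3f}, average transmissions per slot: {avg_tx:.3f}")
```

In this toy model the rule transmits exactly when v < queue * p_success * aoi, so a larger v trades fewer transmissions for a slower correction of constraint violations.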