Paper title

Quantitative propagation of chaos for mean field Markov decision process with common noise

Authors

Médéric Motte, Huyên Pham

Abstract

We investigate propagation of chaos for mean field Markov decision processes with common noise (CMKV-MDP), when the optimization is performed over randomized open-loop controls on an infinite horizon. We first state a rate of convergence of order $M_N^γ$, where $M_N$ is the mean rate of convergence in Wasserstein distance of the empirical measure and $γ \in (0,1]$ is an explicit constant, for the convergence of the value functions of the $N$-agent control problem with asymmetric open-loop controls towards the value function of the CMKV-MDP. Furthermore, we show how to explicitly construct $(ε+\mathcal{O}(M_N^γ))$-optimal policies for the $N$-agent model from $ε$-optimal policies for the CMKV-MDP. Our approach relies on a sharp comparison between the Bellman operators in the $N$-agent problem and the CMKV-MDP, and on a fine coupling of empirical measures.
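To give a feel for the quantity $M_N$, the following is a minimal sketch that estimates the mean Wasserstein-1 distance between the empirical measure of $N$ i.i.d. samples and the true law by Monte Carlo. It is not taken from the paper: the choice of a Uniform(0,1) law on the real line, the function names `w1_to_uniform` and `mean_rate`, and the grid and trial sizes are all illustrative assumptions. In this one-dimensional toy case the estimate is expected to decay roughly like $N^{-1/2}$; the paper's setting is more general.

```python
import bisect
import random


def w1_to_uniform(samples, grid_size=2000):
    """Approximate W1 between the empirical measure of `samples` and Uniform(0,1).

    In one dimension, W1 equals the integral of |F_N(x) - F(x)| over x,
    where F_N is the empirical CDF and F(x) = x on [0, 1].
    The integral is approximated with a midpoint rule (illustrative choice).
    """
    xs = sorted(samples)
    n = len(xs)
    total = 0.0
    for k in range(grid_size):
        x = (k + 0.5) / grid_size               # midpoint of the k-th cell
        f_n = bisect.bisect_right(xs, x) / n    # empirical CDF at x
        total += abs(f_n - x)
    return total / grid_size


def mean_rate(n, trials=200, seed=0):
    """Monte Carlo estimate of E[W1(empirical measure of n samples, Uniform(0,1))]."""
    rng = random.Random(seed)
    acc = 0.0
    for _ in range(trials):
        acc += w1_to_uniform([rng.random() for _ in range(n)])
    return acc / trials


if __name__ == "__main__":
    # Hypothetical toy stand-in for M_N: the estimate should shrink as N grows.
    for n in (10, 100, 1000):
        print(n, mean_rate(n))
```

Raising the estimate to a power $γ \in (0,1]$ would then give a toy analogue of the $M_N^γ$ rate quoted in the abstract; the actual value of $γ$ depends on the model and is not reproduced here.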
