Paper title

Occupation measures arising in finite stochastic games

Authors

Bruno Jaffuel, Miquel Oliu-Barton

Abstract

Shapley (1953) introduced two-player zero-sum discounted stochastic games, henceforth stochastic games, a model where a state variable follows a two-controlled Markov chain, the players receive rewards at each stage which add up to $0$, and each maximizes the normalized $\lambda$-discounted sum of stage rewards, for some fixed discount rate $\lambda\in(0,1]$. In this paper, we study asymptotic occupation measures arising in these games, as the discount rate goes to $0$.
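
For orientation, the normalized discounted sum mentioned above is usually written as follows. The notation is an assumption made here for illustration and is not taken from the abstract: $g_m$ denotes the stage-$m$ reward of player 1 (player 2 receives $-g_m$), and $k_m$ denotes the state at stage $m$.

$$\sum_{m\ge 1} \lambda(1-\lambda)^{m-1}\, g_m, \qquad \text{where } \sum_{m\ge 1} \lambda(1-\lambda)^{m-1} = 1 .$$

Under the same assumed notation, a discounted occupation measure would plausibly assign to each state $k$ the expected discounted fraction of time spent there, $\sum_{m\ge 1} \lambda(1-\lambda)^{m-1}\, \mathbb{P}(k_m = k)$, for a given pair of strategies. The paper's exact definition may differ.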
