Paper title
Occupation measures arising in finite stochastic games
Paper authors
Paper abstract
Shapley (1953) introduced two-player zero-sum discounted stochastic games, henceforth stochastic games: a model in which a state variable follows a Markov chain controlled by both players, the players receive stage rewards that sum to $0$, and each player maximizes the normalized $\lambda$-discounted sum of stage rewards, for some fixed discount rate $\lambda\in(0,1]$. In this paper, we study the asymptotic occupation measures arising in these games as the discount rate goes to $0$.
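As a reading aid, the normalized discounted sum mentioned in the abstract is usually written as follows, where $\lambda$ is the discount rate. The symbols $g_m$ (reward at stage $m$) and $k_m$ (state at stage $m$) are notation assumed here for illustration; the abstract does not fix them.

$$\sum_{m\ge 1} \lambda(1-\lambda)^{m-1}\, g_m$$

The weights $\lambda(1-\lambda)^{m-1}$ sum to $1$, which is what "normalized" refers to. Under the same assumed notation, a natural candidate for the $\lambda$-discounted occupation measure of a state $k$ is $\sum_{m\ge 1}\lambda(1-\lambda)^{m-1}\,\mathbb{P}(k_m=k)$, the discounted expected fraction of time spent in $k$; the paper's own definition may differ in detail.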