论文标题

比较标记的马尔可夫决策过程

Comparing Labelled Markov Decision Processes

论文作者

Kiefer, Stefan, Tang, Qiyi

论文摘要

标记的马尔可夫决策过程是一个标有非确定性的马尔可夫链,即,与标记为MDP的策略一起引起了标记的马尔可夫链。该模型与马尔可夫链间隔有关。通过对等效检查验证匿名性的应用的激励,我们研究了两个标记的MDP的算法比较,特别是在痕量等价方面,是否存在MDP的策略,以使MDP成为等效性和不等性。我们提供了第一种多项式时间算法,用于计算无内存策略,以使两个标记为MDP的MDP不等,如果存在此类策略。我们还研究了定性问题的计算复杂性,以使总变异距离和小于一个或等于一个的概率差异距离。

A labelled Markov decision process is a labelled Markov chain with nondeterminism, i.e., together with a strategy a labelled MDP induces a labelled Markov chain. The model is related to interval Markov chains. Motivated by applications of equivalence checking for the verification of anonymity, we study the algorithmic comparison of two labelled MDPs, in particular, whether there exist strategies such that the MDPs become equivalent/inequivalent, both in terms of trace equivalence and in terms of probabilistic bisimilarity. We provide the first polynomial-time algorithms for computing memoryless strategies to make the two labelled MDPs inequivalent if such strategies exist. We also study the computational complexity of qualitative problems about making the total variation distance and the probabilistic bisimilarity distance less than one or equal to one.

扫码加入交流群

加入微信交流群

微信交流群二维码

扫码加入学术交流群,获取更多资源