Policy Synthesis for Switched Linear Systems with Markov Decision Process Switching

Paper Title

Policy Synthesis for Switched Linear Systems with Markov Decision Process Switching

Authors

Bo Wu, Murat Cubuktepe, Franck Djeumou, Zhe Xu, Ufuk Topcu

Abstract

We study the synthesis of mode switching protocols for a class of discrete-time switched linear systems in which the mode jumps are governed by Markov decision processes (MDPs). We call such systems MDP-JLS for brevity. Each state of the MDP corresponds to a mode in the switched system. The probabilistic state transitions in the MDP represent the mode transitions. We focus on finding a policy that selects the switching actions at each mode such that the switched system that follows these actions is guaranteed to be stable. Given a policy in the MDP, the considered MDP-JLS reduces to a Markov jump linear system (MJLS). We consider both mean-square stability and stability with probability one. For mean-square stability, we leverage existing stability conditions for MJLSs and propose efficient semidefinite programming formulations to find a stabilizing policy in the MDP. For stability with probability one, we derive new sufficient conditions and compute a stabilizing policy using linear programming. We also extend the policy synthesis results to MDP-JLS with uncertain mode transition probabilities.
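
The reduction described in the abstract (a fixed policy turns the MDP-JLS into an MJLS) can be illustrated with a small sketch. It is not the paper's synthesis method: it only checks a given policy, using the standard spectral-radius test for mean-square stability of an MJLS x(k+1) = A_θ(k) x(k). The function names, the array layout `T[a, i, j]`, and the two-mode scalar example are assumptions made for illustration.

```python
import numpy as np

def induced_transition_matrix(T, policy):
    """Mode transition matrix of the MJLS induced by a (randomized) policy.

    Assumed layout (hypothetical, not from the paper):
      T[a, i, j]   = Pr(next mode j | current mode i, action a)
      policy[i, a] = Pr(action a | current mode i)
    """
    T = np.asarray(T, dtype=float)
    policy = np.asarray(policy, dtype=float)
    # P[i, j] = sum over actions a of policy[i, a] * T[a, i, j]
    return np.einsum("ia,aij->ij", policy, T)

def mean_square_stability(A_modes, P):
    """Standard MJLS test: mean-square stable iff the spectral radius of
    (P^T kron I) @ blockdiag(A_i kron A_i) is strictly below 1."""
    m = len(A_modes)
    n2 = A_modes[0].shape[0] ** 2
    D = np.zeros((m * n2, m * n2))
    for i, A in enumerate(A_modes):
        D[i * n2:(i + 1) * n2, i * n2:(i + 1) * n2] = np.kron(A, A)
    M = np.kron(P.T, np.eye(n2)) @ D
    rho = float(np.max(np.abs(np.linalg.eigvals(M))))
    return rho < 1.0, rho

# Toy data (assumed): mode 0 is contracting, mode 1 is expanding.
A_modes = [np.array([[0.5]]), np.array([[1.5]])]
# Action 0 mostly leads to mode 0, action 1 mostly leads to mode 1.
T = np.array([
    [[0.9, 0.1], [0.9, 0.1]],
    [[0.1, 0.9], [0.1, 0.9]],
])
always_action_0 = np.array([[1.0, 0.0], [1.0, 0.0]])
always_action_1 = np.array([[0.0, 1.0], [0.0, 1.0]])

stable_0, rho_0 = mean_square_stability(A_modes, induced_transition_matrix(T, always_action_0))
stable_1, rho_1 = mean_square_stability(A_modes, induced_transition_matrix(T, always_action_1))
print(stable_0, rho_0)
print(stable_1, rho_1)
```

In this toy example the policy that always picks action 0 yields a mean-square stable MJLS, while always picking action 1 does not. Searching over policies, rather than checking one, is where the semidefinite programming formulations mentioned in the abstract would come in.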
