Maximum entropy optimal density control of discrete-time linear systems and Schrödinger bridges

Paper title

Maximum entropy optimal density control of discrete-time linear systems and Schrödinger bridges

Authors

Kaito Ito, Kenji Kashima

Abstract

We consider an entropy-regularized version of optimal density control of deterministic discrete-time linear systems. Entropy regularization, or a maximum entropy (MaxEnt) method for optimal control, has attracted much attention, especially in reinforcement learning, due to its many advantages such as a natural exploration strategy. Despite these merits, high-entropy control policies induced by the regularization introduce probabilistic uncertainty into systems, which severely limits the applicability of MaxEnt optimal control to safety-critical systems. To remedy this situation, we impose a Gaussian density constraint at a specified time on the MaxEnt optimal control to directly control state uncertainty. Specifically, we derive the explicit form of the MaxEnt optimal density control. In addition, we consider the case where density constraints are replaced by fixed point constraints. Then, we characterize the associated state process as a pinned process, which is a generalization of the Brownian bridge to linear systems. Finally, we reveal that the MaxEnt optimal density control gives the so-called Schrödinger bridge associated with a discrete-time linear system.
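
To make the notion of a pinned process concrete, the following minimal sketch simulates only the classical special case that the abstract says is being generalized: a scalar Gaussian random walk conditioned on its end point, that is, a discrete-time Brownian bridge. It is not the paper's control law or derivation. The function name `sample_pinned_walk`, its parameters, and the unit noise variance are assumptions made for illustration.

```python
import numpy as np


def sample_pinned_walk(x0, x_target, n_steps, rng):
    """Sample x_{k+1} = x_k + w_k with w_k ~ N(0, 1), conditioned on
    x_{n_steps} = x_target (a discrete-time Brownian bridge).

    Illustrative assumption: scalar state and unit-variance noise.
    """
    x = np.empty(n_steps + 1)
    x[0] = x0
    for k in range(n_steps):
        remaining = n_steps - k  # steps left until the pinned end point
        # Given x_k and the end point, the next increment is Gaussian with
        # mean (x_target - x_k) / remaining and variance (remaining - 1) / remaining.
        mean = x[k] + (x_target - x[k]) / remaining
        std = np.sqrt((remaining - 1) / remaining)
        x[k + 1] = mean + std * rng.standard_normal()
    return x


rng = np.random.default_rng(0)
path = sample_pinned_walk(x0=1.0, x_target=-2.0, n_steps=20, rng=rng)
```

At the last step the conditional variance is zero, so every sampled path ends at `x_target` (up to floating-point rounding) while remaining random in between.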
