Paper Title
Approximative Policy Iteration for Exit Time Feedback Control Problems driven by Stochastic Differential Equations using Tensor Train format
Paper Authors
Paper Abstract
We consider a stochastic optimal exit time feedback control problem. The Bellman equation is solved approximately via the Policy Iteration algorithm on a polynomial ansatz space, leading to a sequence of linear equations. As multivariate polynomials of high degree are needed, the corresponding equations suffer from the curse of dimensionality even in moderate dimensions. We employ tensor-train methods to overcome this problem. The approximation step within the Policy Iteration is performed via a least-squares ansatz, and the integration is carried out via Monte-Carlo methods. Numerical evidence is given for the (multidimensional) double-well potential and a three-hole potential.
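The abstract outlines the overall loop: Policy Iteration, where each policy-evaluation step fits a polynomial value function by least squares on Monte-Carlo samples, followed by a greedy improvement step. The sketch below illustrates this generic structure on a hypothetical 1-d exit-time toy problem; it is not the paper's method — the tensor-train representation is omitted, and all names, dynamics, and parameters (`step`, `features`, `rollout_cost`, the action set, the cost of 1 per unit time) are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical 1-d exit-time toy problem (illustration only):
# state x evolves as x' = x + a*dt + sigma*sqrt(dt)*xi with xi ~ N(0, 1),
# running cost 1 per unit time, and the episode exits when |x| >= 1.
dt, sigma = 0.01, 0.5
actions = np.array([-1.0, 0.0, 1.0])  # assumed discrete control set

def step(x, a):
    # one Euler-Maruyama step of the controlled SDE
    return x + a * dt + sigma * np.sqrt(dt) * rng.standard_normal()

def features(x, deg=6):
    # polynomial ansatz space: monomials 1, x, ..., x^deg
    return np.array([x ** k for k in range(deg + 1)])

def rollout_cost(x0, policy, max_steps=2000):
    # Monte-Carlo estimate of the cost-to-go under `policy`
    x, cost = x0, 0.0
    for _ in range(max_steps):
        if abs(x) >= 1.0:
            return cost
        cost += dt
        x = step(x, policy(x))
    return cost

def policy_iteration(n_iters=3, n_samples=200):
    theta = np.zeros(7)
    policy = lambda x: 0.0  # initial policy: no control
    for _ in range(n_iters):
        # policy evaluation: least-squares fit of the polynomial
        # value function on Monte-Carlo cost-to-go samples
        xs = rng.uniform(-1, 1, n_samples)
        ys = np.array([rollout_cost(x, policy) for x in xs])
        Phi = np.stack([features(x) for x in xs])
        theta, *_ = np.linalg.lstsq(Phi, ys, rcond=None)
        value = lambda x, th=theta: features(x) @ th
        # policy improvement: greedy one-step lookahead on the drift
        policy = lambda x, v=value: actions[np.argmin(
            [dt + v(np.clip(x + a * dt, -1.0, 1.0)) for a in actions]
        )]
    return theta, policy
```

In the high-dimensional setting the paper targets, the coefficient vector `theta` would be replaced by a tensor-train decomposition, since a full multivariate polynomial basis grows exponentially with the dimension.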