Paper Title
Approximative Policy Iteration for Exit Time Feedback Control Problems driven by Stochastic Differential Equations using Tensor Train format
Paper Authors
Paper Abstract
We consider a stochastic optimal exit time feedback control problem. The Bellman equation is solved approximately via the Policy Iteration algorithm on a polynomial ansatz space, leading to a sequence of linear equations. As multivariate polynomials of high degree are needed, the corresponding equations suffer from the curse of dimensionality even in moderate dimensions. We employ tensor-train methods to overcome this problem. The approximation step within the Policy Iteration is performed via a least-squares ansatz, and the integration is carried out via Monte-Carlo methods. Numerical evidence is given for the (multidimensional) double-well potential and a three-hole potential.
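The abstract outlines the overall loop: Policy Iteration, where each policy-evaluation step fits a polynomial value function by least squares on Monte-Carlo samples, followed by a greedy improvement step. The sketch below illustrates this generic structure on a hypothetical 1-d exit-time toy problem; it is not the paper's method — the tensor-train representation is omitted, and all names, dynamics, and parameters (`step`, `features`, `rollout_cost`, the action set, the cost of 1 per unit time) are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical 1-d exit-time toy problem (illustration only):
# state x evolves as x' = x + a*dt + sigma*sqrt(dt)*xi with xi ~ N(0, 1),
# running cost 1 per unit time, and the episode exits when |x| >= 1.
dt, sigma = 0.01, 0.5
actions = np.array([-1.0, 0.0, 1.0])  # assumed discrete control set

def step(x, a):
    # one Euler-Maruyama step of the controlled SDE
    return x + a * dt + sigma * np.sqrt(dt) * rng.standard_normal()

def features(x, deg=6):
    # polynomial ansatz space: monomials 1, x, ..., x^deg
    return np.array([x ** k for k in range(deg + 1)])

def rollout_cost(x0, policy, max_steps=2000):
    # Monte-Carlo estimate of the cost-to-go under `policy`
    x, cost = x0, 0.0
    for _ in range(max_steps):
        if abs(x) >= 1.0:
            return cost
        cost += dt
        x = step(x, policy(x))
    return cost

def policy_iteration(n_iters=3, n_samples=200):
    theta = np.zeros(7)
    policy = lambda x: 0.0  # initial policy: no control
    for _ in range(n_iters):
        # policy evaluation: least-squares fit of the polynomial
        # value function on Monte-Carlo cost-to-go samples
        xs = rng.uniform(-1, 1, n_samples)
        ys = np.array([rollout_cost(x, policy) for x in xs])
        Phi = np.stack([features(x) for x in xs])
        theta, *_ = np.linalg.lstsq(Phi, ys, rcond=None)
        value = lambda x, th=theta: features(x) @ th
        # policy improvement: greedy one-step lookahead on the drift
        policy = lambda x, v=value: actions[np.argmin(
            [dt + v(np.clip(x + a * dt, -1.0, 1.0)) for a in actions]
        )]
    return theta, policy
```

In the high-dimensional setting the paper targets, the coefficient vector `theta` would be replaced by a tensor-train decomposition, since a full multivariate polynomial basis grows exponentially with the dimension.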