Paper title
Learning Optimal Feedback Operators and their Polynomial Approximation
Paper authors
Paper abstract
A learning-based method for obtaining feedback laws for nonlinear optimal control problems is proposed. The learning problem is posed such that the open-loop value function is its optimal solution. This infinite-dimensional function-space problem is approximated by a polynomial ansatz, and the convergence of this approximation is analyzed. An $\ell_1$ penalty term is employed which, combined with the proximal point method, makes it possible to find sparse solutions of the learning problem. The approach requires multiple evaluations of the elements of the polynomial basis and of their derivatives. To do this efficiently, a graph-theoretic algorithm is devised. Several examples underline that the proposed methodology provides a promising approach for mitigating the curse of dimensionality that would arise if the optimal feedback law were obtained by solving the Hamilton-Jacobi-Bellman equation.
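To illustrate how an $\ell_1$ penalty combined with a proximal step can yield sparse polynomial coefficients, the following is a minimal sketch on a toy one-dimensional regression problem. It is not the method of the paper: the abstract refers to the proximal point method applied to a learning problem for the value function, whereas this sketch substitutes a plain proximal-gradient (ISTA) iteration on a least-squares fit. All names (`soft_threshold`, `monomial_features`, `sparse_polynomial_fit`) and all parameter values are hypothetical and chosen only for illustration.

```python
import numpy as np


def soft_threshold(v, tau):
    """Proximal operator of tau * ||.||_1: shrink each entry toward zero by tau."""
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)


def monomial_features(x, degree):
    """Matrix with columns 1, x, x^2, ..., x^degree for a 1-D sample array x."""
    return np.vander(x, degree + 1, increasing=True)


def objective(A, y, c, lam):
    """Least-squares data term plus l1 penalty on the coefficients."""
    return 0.5 * np.mean((A @ c - y) ** 2) + lam * np.sum(np.abs(c))


def sparse_polynomial_fit(x, y, degree, lam, n_iter=5000):
    """Proximal-gradient (ISTA) iteration for
    min_c  0.5/n * ||A c - y||^2 + lam * ||c||_1  (toy stand-in problem)."""
    A = monomial_features(x, degree)
    n = len(y)
    # Step size 1/L, where L is the Lipschitz constant of the smooth term's gradient.
    step = n / np.linalg.norm(A, 2) ** 2
    c = np.zeros(A.shape[1])
    for _ in range(n_iter):
        grad = A.T @ (A @ c - y) / n                      # gradient of the data term
        c = soft_threshold(c - step * grad, step * lam)   # proximal step for the l1 term
    return c


# Toy data (assumed for illustration): samples of 2 x^2 on [-1, 1].
x = np.linspace(-1.0, 1.0, 50)
y = 2.0 * x**2
coeffs = sparse_polynomial_fit(x, y, degree=5, lam=1e-3)
```

The soft-thresholding step is what sets small coefficients exactly to zero, which is the mechanism behind the sparsity the abstract mentions. Inspecting `coeffs` shows which monomials the penalized fit retains for this toy target.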