Title
Reinforced optimal control
Authors
Abstract
Least squares Monte Carlo methods are popular numerical approximation methods for solving stochastic control problems. Based on dynamic programming, their key feature is the approximation of the conditional expectation of future rewards by linear least squares regression. Hence, the choice of basis functions is crucial for the accuracy of the method. Earlier work by some of us [Belomestny, Schoenmakers, Spokoiny, Zharkynbay. Commun. Math. Sci., 18(1):109-121, 2020] (arXiv:1808.02341) proposes to reinforce the basis functions in the case of optimal stopping problems by already computed value functions for later times, thereby considerably improving the accuracy with limited additional computational cost. We extend the reinforced regression method to a general class of stochastic control problems, while considerably improving the method's efficiency, as demonstrated by substantial numerical examples as well as theoretical analysis.
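To make the idea of a reinforced basis concrete, the following is a minimal sketch for the optimal stopping case only, not the authors' implementation. It assumes a Bermudan put under geometric Brownian motion, a small polynomial basis, and hypothetical function names (`simulate_gbm`, `bermudan_put_lsmc`); none of these choices come from the abstract. With `reinforce=True`, the regression at each date adds one extra basis function: the value function estimate already computed for the next exercise date. The sketch evaluates that function by naive recursion, which is exactly the kind of cost a more efficient scheme would want to avoid.

```python
import numpy as np


def simulate_gbm(n_paths, n_steps, s0, r, sigma, maturity, rng):
    """Simulate geometric Brownian motion on an equidistant grid (assumed model)."""
    dt = maturity / n_steps
    z = rng.standard_normal((n_paths, n_steps))
    log_steps = (r - 0.5 * sigma**2) * dt + sigma * np.sqrt(dt) * z
    log_paths = np.hstack([np.zeros((n_paths, 1)), np.cumsum(log_steps, axis=1)])
    return s0 * np.exp(log_paths)  # shape (n_paths, n_steps + 1)


def bermudan_put_lsmc(paths, strike, r, maturity, reinforce=True):
    """Least squares Monte Carlo estimate of a Bermudan put price at time 0.

    Exercise is allowed at grid dates 1, ..., n_steps (not at time 0).
    """
    n_steps = paths.shape[1] - 1
    disc = np.exp(-r * maturity / n_steps)

    def payoff(s):
        return np.maximum(strike - s, 0.0)

    next_value = payoff              # at maturity the value function is the payoff
    cashflow = payoff(paths[:, -1])  # realised reward per path, valued at the current date

    for j in range(n_steps - 1, 0, -1):
        cashflow = disc * cashflow   # discount one step back to date j
        x = paths[:, j]

        def basis(s, later_value=next_value):
            m = s / strike           # rescale for a better-conditioned regression
            cols = [np.ones_like(m), m, m**2]
            if reinforce:
                # Reinforcement: value function estimate from the next date.
                cols.append(later_value(s) / strike)
            return np.column_stack(cols)

        design = basis(x)
        coef, *_ = np.linalg.lstsq(design, cashflow, rcond=None)
        continuation = design @ coef  # regression estimate of the continuation value

        def value(s, basis=basis, coef=coef):
            return np.maximum(payoff(s), basis(s) @ coef)

        g = payoff(x)
        exercise = (g > 0.0) & (g >= continuation)
        cashflow = np.where(exercise, g, cashflow)
        next_value = value

    return float(disc * cashflow.mean())


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    paths = simulate_gbm(20000, 10, 100.0, 0.05, 0.2, 1.0, rng)
    print("plain basis:     ", bermudan_put_lsmc(paths, 100.0, 0.05, 1.0, reinforce=False))
    print("reinforced basis:", bermudan_put_lsmc(paths, 100.0, 0.05, 1.0, reinforce=True))
```

Running both variants on the same simulated paths gives a rough feel for how much the extra basis function changes the estimate. The estimate is computed in-sample, so a careful comparison would re-evaluate the fitted exercise rule on fresh paths.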