Paper Title
Meta-Learning Guarantees for Online Receding Horizon Learning Control
Paper Authors
Paper Abstract
In this paper, we provide provable regret guarantees for an online meta-learning receding horizon control algorithm in an iterative control setting. We consider the setting where, in each iteration, the system to be controlled is a different and unknown linear deterministic system, the cost for the controller in an iteration is a general additive cost function, and there are affine control input constraints. By analysing the conditions under which sub-linear regret is achievable, we prove that the meta-learning online receding horizon controller achieves, on average, a dynamic regret for the controller cost of $\tilde{O}((1+1/\sqrt{N})T^{3/4})$, where $N$ is the number of iterations. Thus, we show that the worst-case regret for learning within an iteration improves with experience over more iterations, with a guarantee on the rate of improvement.
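As a hedged sketch of how the stated bound can be read (the stage cost $c_t$, the comparator trajectory $(x_t^{*}, u_t^{*})$, and the per-iteration horizon $T$ are illustrative assumptions, not definitions taken from the paper), the dynamic regret of iteration $i$ and the averaged guarantee would take the form

% Illustrative sketch: dynamic regret of iteration i, i.e. the online controller's
% accumulated cost minus the optimal cost in hindsight for that iteration's
% (unknown) system over the horizon T.
\[
  R_i(T) \;=\; \sum_{t=1}^{T} c_t(x_t, u_t) \;-\; \sum_{t=1}^{T} c_t(x_t^{*}, u_t^{*}),
\]
% The abstract's claim, read as an average over the N iterations:
\[
  \frac{1}{N}\sum_{i=1}^{N} R_i(T) \;=\; \tilde{O}\!\left(\left(1 + \frac{1}{\sqrt{N}}\right) T^{3/4}\right),
\]
so, under this reading, the per-iteration regret bound tightens at a $1/\sqrt{N}$ rate as the controller accumulates experience across iterations.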