Paper Title
Planning with Learned Dynamics: Probabilistic Guarantees on Safety and Reachability via Lipschitz Constants
Authors
Abstract
We present a method for feedback motion planning of systems with unknown dynamics which provides probabilistic guarantees on safety, reachability, and goal stability. To find a domain in which a learned control-affine approximation of the true dynamics can be trusted, we estimate the Lipschitz constant of the difference between the true and learned dynamics, and ensure the estimate is valid with a given probability. Provided the system has at least as many controls as states, we also derive existence conditions for a one-step feedback law which can keep the real system within a small bound of a nominal trajectory planned with the learned dynamics. Our method imposes the feedback law existence as a constraint in a sampling-based planner, which returns a feedback policy around a nominal plan ensuring that, if the Lipschitz constant estimate is valid, the true system is safe during plan execution, reaches the goal, and is ultimately invariant in a small set about the goal. We demonstrate our approach by planning using learned models of a 6D quadrotor and a 7DOF Kuka arm. We show that a baseline which plans using the same learned dynamics without considering the error bound or the existence of the feedback law can fail to stabilize around the plan and become unsafe.
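To make the Lipschitz-based idea concrete, the following is a minimal sketch, not the authors' implementation. It assumes a hypothetical 1D system (`true_dynamics`) and a hypothetical control-affine model (`learned_dynamics`), and it estimates the Lipschitz constant of their difference by the largest pairwise slope over sampled points. That simple estimator is a stand-in chosen for illustration: it only lower-bounds the true constant and carries no probabilistic validity guarantee by itself, whereas the abstract states that the paper's estimate is valid with a given probability. The helper `error_bound` then shows the standard consequence of Lipschitz continuity: the model error at a query point is at most the error at a sampled point plus the Lipschitz constant times the distance to that point.

```python
import itertools
import math
import random


def true_dynamics(x, u):
    # Hypothetical "unknown" 1D system, used here only to generate data.
    return x + 0.1 * math.sin(x) + 0.1 * u


def learned_dynamics(x, u):
    # Hypothetical control-affine model f0(x) + g(x) * u (sin(x) replaced by x).
    return x + 0.1 * x + 0.1 * u


def model_error(x, u):
    # Difference between the true and learned dynamics at (x, u).
    return true_dynamics(x, u) - learned_dynamics(x, u)


def estimate_lipschitz(error_fn, points):
    """Largest pairwise slope of error_fn over the samples.

    This is a lower bound on the true Lipschitz constant, not a
    probabilistically valid estimate.
    """
    best = 0.0
    for (x1, u1), (x2, u2) in itertools.combinations(points, 2):
        dist = math.hypot(x1 - x2, u1 - u2)
        if dist > 1e-12:
            slope = abs(error_fn(x1, u1) - error_fn(x2, u2)) / dist
            best = max(best, slope)
    return best


def error_bound(lipschitz, error_fn, points, query):
    """Tightest bound |e(p)| + L * ||query - p|| over all sampled points p."""
    qx, qu = query
    return min(
        abs(error_fn(x, u)) + lipschitz * math.hypot(qx - x, qu - u)
        for x, u in points
    )


if __name__ == "__main__":
    random.seed(0)
    samples = [(random.uniform(-1, 1), random.uniform(-1, 1)) for _ in range(200)]
    l_hat = estimate_lipschitz(model_error, samples)
    print("estimated Lipschitz constant:", l_hat)
    print("error bound at (0.3, -0.2):", error_bound(l_hat, model_error, samples, (0.3, -0.2)))
```

In this sketch, a planner could treat a state-control pair as trusted only when `error_bound` falls below a chosen tolerance; that threshold and the way it would enter the planner are assumptions, since the abstract does not specify them.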