Differentiable Constrained Imitation Learning for Robot Motion Planning and Control

Paper Title

Differentiable Constrained Imitation Learning for Robot Motion Planning and Control

Authors

Christopher Diehl, Janis Adamek, Martin Krüger, Frank Hoffmann, Torsten Bertram

Abstract

Motion planning and control are crucial components of robotics applications such as automated driving. Here, spatio-temporal hard constraints like system dynamics and safety boundaries (e.g., obstacles) restrict the robot's motions. Direct methods from optimal control solve a constrained optimization problem. However, in many applications, finding a proper cost function is inherently difficult because partially conflicting objectives must be weighted against each other. On the other hand, Imitation Learning (IL) methods such as Behavior Cloning (BC) provide an intuitive framework for learning decision-making from offline demonstrations and constitute a promising avenue for planning and control in complex robot applications. Prior work primarily relied on soft-constraint approaches, which use additional auxiliary loss terms describing the constraints. However, catastrophic safety-critical failures might occur in out-of-distribution (OOD) scenarios. This work integrates the flexibility of IL with hard constraint handling in optimal control. Our approach constitutes a general framework for constrained robotic motion planning and control, as well as traffic agent simulation, although we focus on mobile robot and automated driving applications. Hard constraints are integrated into the learning problem in a differentiable manner, via explicit completion and gradient-based correction. Simulated experiments of mobile robot navigation and automated driving provide evidence for the performance of the proposed method.
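The abstract names two mechanisms, explicit completion and gradient-based correction, without giving details. The sketch below is one possible reading of these terms and is not the authors' implementation. It assumes a single-integrator model x[t+1] = x[t] + dt * u[t], a wall constraint y <= y_max as the safety boundary, and a constant control sequence standing in for a learned policy's output. All function names, the step size, and the iteration count are hypothetical. Completion computes the states from the controls, so the dynamics hold by construction. Correction then takes gradient steps on the squared wall violation, with the gradient passed back through the rollout.

```python
import numpy as np

DT = 0.1  # assumed time step of the single-integrator model


def complete(x0, controls, dt=DT):
    """Explicit completion: derive states from controls via x[t+1] = x[t] + dt * u[t]."""
    return x0 + dt * np.cumsum(controls, axis=0)


def wall_violation(states, y_max):
    """Amount by which each state crosses the wall y <= y_max (zero if feasible)."""
    return np.maximum(0.0, states[:, 1] - y_max)


def correct(x0, controls, y_max, dt=DT, step=0.2, iters=50):
    """Gradient-based correction: descend on the sum of squared violations."""
    u = np.array(controls, dtype=float)  # copy, so the input is not modified
    for _ in range(iters):
        v = wall_violation(complete(x0, u, dt), y_max)
        if not v.any():
            break  # all states are feasible
        grad_states = np.zeros_like(u)
        grad_states[:, 1] = 2.0 * v  # derivative of v**2 with respect to y
        # Chain rule through the rollout: u[k] shifts every state from k onward,
        # so its gradient is dt times the sum of the later state gradients.
        grad_u = dt * np.cumsum(grad_states[::-1], axis=0)[::-1]
        u -= step * grad_u
    return u


x0 = np.array([0.0, 0.0])
raw_controls = np.tile([1.0, 0.8], (20, 1))  # hypothetical stand-in for a policy output
fixed_controls = correct(x0, raw_controls, y_max=1.0)

before = wall_violation(complete(x0, raw_controls), 1.0).max()
after = wall_violation(complete(x0, fixed_controls), 1.0).max()
```

In this sketch the corrected trajectory still satisfies the dynamics exactly, because the states are always recomputed from the controls. A fixed number of gradient steps reduces the violation but does not by itself guarantee that it reaches zero.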
