Combining Reinforcement Learning with Model Predictive Control for On-Ramp Merging

Paper Title

Combining Reinforcement Learning with Model Predictive Control for On-Ramp Merging

Authors

Joseph Lubars, Harsh Gupta, Sandeep Chinchali, Liyun Li, Adnan Raja, R. Srikant, Xinzhou Wu

Abstract

We consider the problem of designing an algorithm to allow a car to autonomously merge onto a highway from an on-ramp. Two broad classes of techniques have been proposed to solve motion planning problems in autonomous driving: Model Predictive Control (MPC) and Reinforcement Learning (RL). In this paper, we first establish the strengths and weaknesses of state-of-the-art MPC and RL-based techniques through simulations. We show that the performance of the RL agent is worse than that of the MPC solution from the perspective of safety and robustness to out-of-distribution traffic patterns, i.e., traffic patterns which were not seen by the RL agent during training. On the other hand, the performance of the RL agent is better than that of the MPC solution when it comes to efficiency and passenger comfort. We subsequently present an algorithm which blends the model-free RL agent with the MPC solution and show that it provides better trade-offs between all metrics -- passenger comfort, efficiency, crash rate and robustness.
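
The abstract does not say how the RL agent and the MPC solution are blended, so the following is only a minimal sketch of one plausible scheme and not the authors' algorithm: use the RL acceleration when a simple forward check predicts a safe gap, and fall back to the MPC acceleration otherwise. The function names, the one-dimensional gap model with a constant-speed lead vehicle, and the thresholds (`min_safe_gap_m`, `horizon_s`, `dt`) are all assumptions made for illustration.

```python
def min_gap_over_horizon(gap_m, closing_speed_mps, ego_accel_mps2,
                         horizon_s=3.0, dt=0.1):
    """Smallest predicted gap to the lead vehicle over a short horizon.

    Hypothetical model: the lead vehicle keeps a constant speed, so the
    closing speed changes only through the ego car's own acceleration.
    """
    steps = int(round(horizon_s / dt))
    gap = gap_m
    closing = closing_speed_mps
    smallest = gap
    for _ in range(steps):
        closing += ego_accel_mps2 * dt   # ego speeds up -> closes faster
        gap -= closing * dt              # the gap shrinks while closing > 0
        smallest = min(smallest, gap)
    return smallest


def blend_action(rl_accel, mpc_accel, gap_m, closing_speed_mps,
                 min_safe_gap_m=5.0):
    """Return the RL action if it passes the safety check, else the MPC action."""
    predicted = min_gap_over_horizon(gap_m, closing_speed_mps, rl_accel)
    if predicted >= min_safe_gap_m:
        return rl_accel   # RL action is kept (better efficiency and comfort in the paper)
    return mpc_accel      # MPC action as the conservative fallback


if __name__ == "__main__":
    # Large gap: the RL action is kept.
    print(blend_action(1.0, -2.0, gap_m=50.0, closing_speed_mps=0.0))
    # Small gap while closing: the MPC action is used instead.
    print(blend_action(1.0, -2.0, gap_m=8.0, closing_speed_mps=2.0))
```

A hard switch like this is only one design choice. A weighted mix of the two accelerations, or an MPC that treats the RL output as a reference, would also fit the description of blending in the abstract.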
