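COOR-PLT: A hierarchical control model for coordinating adaptive platoons of connected and autonomous vehicles at signal-free intersections based on deep reinforcement learning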

Paper Title

COOR-PLT: A hierarchical control model for coordinating adaptive platoons of connected and autonomous vehicles at signal-free intersections based on deep reinforcement learning

Authors

Li, Duowei; Wu, Jianping; Zhu, Feng; Chen, Tianyi; Wong, Yiik Diew

Abstract

Platooning and coordination are two strategies frequently proposed for controlling connected and autonomous vehicles (CAVs) at signal-free intersections in place of conventional traffic signals. However, few studies have attempted to integrate the two strategies to better facilitate CAV control at signal-free intersections. To this end, this study proposes a hierarchical control model, named COOR-PLT, to coordinate adaptive CAV platoons at a signal-free intersection based on deep reinforcement learning (DRL). COOR-PLT has a two-layer framework. The first layer uses a centralized control strategy to form adaptive platoons. The optimal size of each platoon is determined by considering multiple objectives (i.e., efficiency, fairness, and energy saving). The second layer employs a decentralized control strategy to coordinate multiple platoons passing through the intersection. Each platoon is labeled with either coordinated or independent status, upon which its passing priority is determined. A Deep Q-network (DQN), an efficient DRL algorithm, is adopted in each of the two layers to determine platoon sizes and passing priorities, respectively. The model is validated and examined in the Simulation of Urban Mobility (SUMO) simulator. The simulation results demonstrate that the model is able to: (1) achieve satisfactory convergence performance; (2) adaptively determine platoon size in response to varying traffic conditions; and (3) completely avoid deadlocks at the intersection. Comparison with other control methods shows the advantage of adopting adaptive platooning and DRL-based coordination strategies. The model also outperforms several state-of-the-art methods in reducing travel time and fuel consumption under different traffic conditions.
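
The abstract does not give the reward design or network details, so the following is only a minimal sketch of the generic DQN ingredients that such a two-layer scheme could rely on. It shows a weighted scalar reward over the three stated objectives, epsilon-greedy selection over a discrete action set (for example, candidate platoon sizes or passing priorities), and the standard one-step Q-learning target. The function names, the equal default weights, and the weighted-sum form are all assumptions made for illustration and are not taken from the paper.

```python
import random


def scalar_reward(efficiency, fairness, energy_saving,
                  weights=(1 / 3, 1 / 3, 1 / 3)):
    """Combine the three objectives into one reward.

    Assumption: a plain weighted sum with hypothetical weights;
    the paper's actual reward function is not stated in the abstract.
    """
    w_eff, w_fair, w_energy = weights
    return w_eff * efficiency + w_fair * fairness + w_energy * energy_saving


def epsilon_greedy(q_values, epsilon, rng):
    """Pick an action index: explore with probability epsilon, else exploit."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    # Exploit: index of the largest estimated Q-value.
    return max(range(len(q_values)), key=q_values.__getitem__)


def q_target(reward, next_q_values, gamma=0.99, done=False):
    """Standard one-step DQN target: r + gamma * max_a' Q(s', a')."""
    if done:
        return reward
    return reward + gamma * max(next_q_values)


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical Q-values for three candidate platoon sizes.
    q_sizes = [0.1, 0.9, 0.3]
    action = epsilon_greedy(q_sizes, epsilon=0.0, rng=rng)
    print("greedy action index:", action)
    print("target:", q_target(1.0, [0.5, 2.0], gamma=0.5))
```

In a hierarchical setup like the one described, each layer would presumably maintain its own Q-function and call the same selection and target routines with its own state and action set; how the paper actually couples the layers is not specified here.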
