Paper Title

Skip Training for Multi-Agent Reinforcement Learning Controller for Industrial Wave Energy Converters

Paper Authors

Sarkar, Soumyendu, Gundecha, Vineet, Ghorbanpour, Sahand, Shmakov, Alexander, Babu, Ashwin Ramesh, Pichard, Alexandre, Cocho, Mathieu

Paper Abstract

Recent Wave Energy Converters (WEC) are equipped with multiple legs and generators to maximize energy generation. Traditional controllers have shown limitations in capturing complex wave patterns, and the controllers must efficiently maximize the energy capture. This paper introduces a Multi-Agent Reinforcement Learning controller (MARL), which outperforms the traditionally used spring damper controller. Our initial studies show that the complex nature of the problem makes it hard for training to converge. Hence, we propose a novel skip training approach which enables the MARL training to overcome performance saturation and converge to more optimal controllers compared to default MARL training, boosting power generation. We also present another novel skip training with hybrid training initialization (STHTI) approach, where the individual agents of the MARL controller can first be trained individually against the baseline Spring Damper (SD) controller, and then be trained one agent at a time or all together in future iterations to accelerate convergence. We achieved double-digit gains in energy efficiency over the baseline Spring Damper controller with the proposed MARL controllers using the Asynchronous Advantage Actor-Critic (A3C) algorithm.
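The STHTI training schedule described in the abstract can be sketched as follows. This is a minimal illustrative sketch only: the agent count, the function names, and the round-robin ordering of the later phases are assumptions, not details taken from the paper.

```python
# Hypothetical sketch of the STHTI (skip training with hybrid training
# initialization) schedule. All names and phase counts here are
# illustrative assumptions; the paper's actual training uses A3C agents
# controlling the legs of a multi-generator WEC.

def sthti_schedule(n_agents, round_robin_iters=3):
    """Build a list of (trainable_agents, note) phases.

    Phase 1: each MARL agent is pre-trained individually while the
    remaining legs are driven by the baseline Spring Damper (SD)
    controller. Phase 2: in later iterations, agents are trained either
    one at a time (round-robin) or all together, as the abstract
    describes."""
    # Individual pre-training against the SD baseline, one agent each.
    phases = [({i}, "others follow SD baseline") for i in range(n_agents)]
    # Later iterations: one agent at a time, round-robin.
    for it in range(round_robin_iters):
        phases.append(({it % n_agents}, "one agent at a time"))
    # Or all agents trained together.
    phases.append((set(range(n_agents)), "all agents together"))
    return phases

# Example: a 4-leg WEC, one agent per leg (assumed configuration).
for trainable, note in sthti_schedule(4):
    print(sorted(trainable), note)
```

The schedule is the only part of STHTI that the abstract pins down; the per-phase policy updates themselves would be ordinary A3C training steps restricted to the currently trainable agents.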
