Paper Title
A Memetic Algorithm with Reinforcement Learning for Sociotechnical Production Scheduling
Paper Authors
Paper Abstract
The following interdisciplinary article presents a memetic algorithm applying deep reinforcement learning (DRL) to solve practically oriented dual resource constrained flexible job shop scheduling problems (DRC-FJSSP). From research projects in industry, we recognize the need to consider flexible machines, flexible human workers, worker capabilities, setup and processing operations, material arrival times, complex job paths with parallel tasks for bill of material (BOM) manufacturing, sequence-dependent setup times and (partially) automated tasks in human-machine collaboration. In recent years, there has been extensive research on metaheuristic and DRL techniques, but it has focused on simple scheduling environments. However, there are few approaches combining metaheuristics and DRL to generate schedules more reliably and efficiently. In this paper, we first formulate a DRC-FJSSP to map complex industry requirements beyond traditional job shop models. Then we propose a scheduling framework integrating a discrete event simulation (DES) for schedule evaluation, considering parallel computing and multicriteria optimization. Here, a memetic algorithm is enriched with DRL to improve sequencing and assignment decisions. Through numerical experiments with real-world production data, we confirm that the framework generates feasible schedules efficiently and reliably for a balanced optimization of makespan (MS) and total tardiness (TT). Utilizing DRL instead of random metaheuristic operations leads to better results in fewer algorithm iterations and outperforms traditional approaches in such complex environments.
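To make the abstract's core idea more concrete (a memetic search loop where learned guidance replaces random operator choice, and a simulation-style evaluation combines makespan and total tardiness into one objective), the following minimal Python sketch illustrates the principle. It is not the authors' framework: the single-machine evaluation, the swap/shift operators, the bandit-style stand-in for a trained DRL policy, and all names and data are hypothetical placeholders.

```python
"""Minimal sketch (not the paper's code): a memetic loop where a learned policy,
rather than a uniform random choice, selects the local-search operator, and a
toy stand-in for a DES evaluation combines makespan (MS) and tardiness (TT)."""
import random

# Hypothetical job data: processing times and due dates.
JOBS = [
    {"id": 0, "proc": 4, "due": 6},
    {"id": 1, "proc": 3, "due": 5},
    {"id": 2, "proc": 5, "due": 14},
    {"id": 3, "proc": 2, "due": 7},
]

def evaluate(perm, w_ms=0.5, w_tt=0.5):
    """Toy stand-in for a DES run: one machine, sequential processing.
    Returns a weighted objective over makespan and total tardiness."""
    t, tardiness = 0, 0
    for j in perm:
        t += JOBS[j]["proc"]
        tardiness += max(0, t - JOBS[j]["due"])
    return w_ms * t + w_tt * tardiness

def swap(perm):
    """Local-search operator: swap two sequence positions."""
    a, b = random.sample(range(len(perm)), 2)
    perm = list(perm)
    perm[a], perm[b] = perm[b], perm[a]
    return perm

def shift(perm):
    """Local-search operator: move one job to another position."""
    perm = list(perm)
    job = perm.pop(random.randrange(len(perm)))
    perm.insert(random.randrange(len(perm) + 1), job)
    return perm

OPERATORS = [swap, shift]

def policy_pick(scores):
    """Placeholder for a DRL policy: greedily picks the operator with the best
    running average improvement (a bandit-style proxy for a trained agent)."""
    return max(range(len(OPERATORS)), key=lambda i: scores[i])

def memetic_search(generations=50, pop_size=8):
    pop = [random.sample(range(len(JOBS)), len(JOBS)) for _ in range(pop_size)]
    scores = [0.0] * len(OPERATORS)          # running reward per operator
    for _ in range(generations):
        pop.sort(key=evaluate)
        parent = pop[0]
        op_idx = policy_pick(scores)         # learned guidance instead of random
        child = OPERATORS[op_idx](parent)
        improvement = evaluate(parent) - evaluate(child)
        scores[op_idx] = 0.9 * scores[op_idx] + 0.1 * improvement
        pop[-1] = child                      # replace the worst individual
    return min(pop, key=evaluate)

if __name__ == "__main__":
    best = memetic_search()
    print("best sequence:", best, "objective:", evaluate(best))
```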