Paper Title


Investigation of reinforcement learning for shape optimization of profile extrusion dies

Authors

Clemens Fricke, Daniel Wolff, Marco Kemmerling, Stefanie Elgeti

Abstract


Profile extrusion is a continuous production process for manufacturing plastic profiles from molten polymer. Especially interesting is the design of the die, through which the melt is pressed to attain the desired shape. However, due to an inhomogeneous velocity distribution at the die exit or residual stresses inside the extrudate, the final shape of the manufactured part often deviates from the desired one. To avoid these deviations, the shape of the die can be computationally optimized, which has already been investigated in the literature using classical optimization approaches. A new approach in the field of shape optimization is the utilization of Reinforcement Learning (RL) as a learning-based optimization algorithm. RL is based on trial-and-error interactions of an agent with an environment. For each action, the agent is rewarded and informed about the subsequent state of the environment. While not necessarily superior to classical optimization algorithms, e.g., gradient-based or evolutionary ones, for a single problem, RL techniques are expected to perform especially well when similar optimization tasks are repeated, since the agent learns a more general strategy for generating optimal shapes instead of concentrating on just one single problem. In this work, we investigate this approach by applying it to two 2D test cases. The flow-channel geometry can be modified by the RL agent using so-called Free-Form Deformation, a method where the computational mesh is embedded into a transformation spline, which is then manipulated based on the control-point positions. In particular, we investigate the impact of utilizing different agents on the training progress and the potential for wall-time savings by utilizing multiple environments during training.
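The abstract names two technical ingredients: Free-Form Deformation, in which mesh nodes embedded in a spline lattice follow the control points, and the trial-and-error loop in which an agent's action (a control-point displacement) is scored by a reward. The sketch below is a minimal, self-contained illustration in pure NumPy under stated assumptions: the 3x3 Bernstein lattice, the random "agent", and the geometric stand-in reward are not the authors' setup, and no flow solver is involved.

```python
# Minimal sketch: 2D Free-Form Deformation plus a trial-and-error loop.
# Assumptions (not from the paper): 3x3 control lattice, random action
# sampling as a stand-in agent, and a purely geometric reward.
import numpy as np
from math import comb


def bernstein(n, i, t):
    """Bernstein basis polynomial B_i^n evaluated at t in [0, 1]."""
    return comb(n, i) * t ** i * (1.0 - t) ** (n - i)


def ffd(points, control):
    """Map 2D points, given in lattice coordinates (u, v) in [0, 1]^2,
    through the tensor-product spline defined by control points of
    shape (n+1, m+1, 2)."""
    n, m = control.shape[0] - 1, control.shape[1] - 1
    out = np.zeros_like(points, dtype=float)
    for k, (u, v) in enumerate(points):
        for i in range(n + 1):
            for j in range(m + 1):
                out[k] += bernstein(n, i, u) * bernstein(m, j, v) * control[i, j]
    return out


# 3x3 control lattice over the unit square; initially the identity map.
base = np.stack(np.meshgrid(np.linspace(0, 1, 3),
                            np.linspace(0, 1, 3), indexing="ij"), axis=-1)
# A few flow-channel mesh nodes, expressed in lattice coordinates.
mesh = np.array([[0.25, 0.5], [0.50, 0.5], [0.75, 0.5]])
# Hypothetical target positions standing in for the "desired shape".
target = np.array([[0.25, 0.55], [0.50, 0.60], [0.75, 0.55]])

rng = np.random.default_rng(0)
best_reward, best_control = -np.inf, base
for episode in range(200):
    control = base.copy()
    # Action: displace the central control point; all mesh nodes follow.
    control[1, 1] += rng.normal(scale=0.1, size=2)
    deformed = ffd(mesh, control)
    # Stand-in reward: how close the deformed nodes are to the target.
    reward = -np.linalg.norm(deformed - target)
    if reward > best_reward:
        best_reward, best_control = reward, control
print("best reward:", best_reward)
```

In the workflow the paper describes, the reward would instead be derived from a flow simulation of the deformed channel, e.g., penalizing deviation from a homogeneous velocity distribution at the die exit, while the deformation mechanism itself stays the same: the agent only ever touches control-point positions, never individual mesh nodes.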
