Paper Title

Multi-fidelity reinforcement learning framework for shape optimization

Authors

Sahil Bhola, Suraj Pawar, Prasanna Balaprakash, Romit Maulik

Abstract

Deep reinforcement learning (DRL) is a promising outer-loop intelligence paradigm which can deploy problem-solving strategies for complex tasks. Consequently, DRL has been utilized for several scientific applications, specifically in cases where classical optimization or control methods are limited. One key limitation of conventional DRL methods is their episode-hungry nature, which proves to be a bottleneck for tasks that involve costly evaluations of a numerical forward model. In this article, we address this limitation of DRL by introducing a controlled transfer learning framework that leverages a multi-fidelity simulation setting. Our strategy is deployed for an airfoil shape optimization problem at high Reynolds numbers, where our framework can learn an optimal policy for generating efficient airfoil shapes by gathering knowledge from multi-fidelity environments, reducing computational costs by over 30%. Furthermore, our formulation promotes policy exploration and generalization to new environments, thereby preventing over-fitting to data from solely one fidelity. Our results demonstrate this framework's applicability to other scientific DRL scenarios where multi-fidelity environments can be used for policy learning.
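The abstract does not include implementation details, so the following is a minimal, self-contained sketch of the general idea only: pre-train a policy in a cheap low-fidelity environment, then transfer and fine-tune it in the expensive high-fidelity one. The toy reward surface, the fidelity noise levels, the per-episode costs, and the random-search "policy update" (standing in for an actual DRL algorithm) are all illustrative assumptions, not the authors' method.

```python
# Minimal sketch (not the authors' implementation) of controlled transfer
# learning across multi-fidelity environments: most episodes are spent in a
# cheap low-fidelity simulator, and the resulting policy parameters are
# fine-tuned with a few expensive high-fidelity episodes.
import numpy as np

rng = np.random.default_rng(0)

def make_env(fidelity_noise, cost):
    """Toy 'shape optimization' environment: reward peaks when the design
    parameters reach a fixed optimum, observed through fidelity-dependent
    noise. Both the optimum and the costs are illustrative placeholders."""
    optimum = np.array([0.3, -0.5, 0.1])
    def evaluate(params):
        reward = -np.sum((params - optimum) ** 2)
        return reward + fidelity_noise * rng.normal(), cost
    return evaluate

low_fi  = make_env(fidelity_noise=0.05, cost=1.0)    # cheap, noisy surrogate
high_fi = make_env(fidelity_noise=0.0,  cost=100.0)  # expensive, accurate

def train(env, params, episodes, step=0.1):
    """Simple random-search improvement loop, standing in for a DRL update."""
    total_cost = 0.0
    best_reward, _ = env(params)
    for _ in range(episodes):
        candidate = params + step * rng.normal(size=params.shape)
        reward, cost = env(candidate)
        total_cost += cost
        if reward > best_reward:
            params, best_reward = candidate, reward
    return params, best_reward, total_cost

params = np.zeros(3)

# Phase 1: gather most of the experience in the low-fidelity environment.
params, r_lo, c_lo = train(low_fi, params, episodes=500)

# Phase 2: transfer the pre-trained policy and fine-tune it at high fidelity
# with far fewer (expensive) episodes and a smaller exploration step.
params, r_hi, c_hi = train(high_fi, params, episodes=20, step=0.02)

print(f"low-fi pretrain:  reward={r_lo:.4f}, cost={c_lo:.0f}")
print(f"high-fi finetune: reward={r_hi:.4f}, cost={c_hi:.0f}")
```

The two-phase schedule is the point of the sketch: almost all exploration happens where evaluations are cheap, and only a short fine-tuning budget is spent at high fidelity, which mirrors how the paper's framework attributes its reported cost reduction to knowledge gathered in multi-fidelity environments.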
