Real-Time Optimal Guidance and Control for Interplanetary Transfers Using Deep Networks

Paper Title

Real-Time Optimal Guidance and Control for Interplanetary Transfers Using Deep Networks

Authors

Izzo, Dario; Öztürk, Ekin

Abstract

We consider the Earth-Venus mass-optimal interplanetary transfer of a low-thrust spacecraft and show how the optimal guidance can be represented by deep networks in a large portion of the state space and to a high degree of accuracy. Imitation (supervised) learning of optimal examples is used as the network training paradigm. The resulting models, called G&CNETs, are suitable for an on-board, real-time implementation of the spacecraft's optimal guidance and control system. A new general methodology called Backward Generation of Optimal Examples is introduced and shown to efficiently create all the optimal state-action pairs necessary to train G&CNETs without solving optimal control problems. Compared with previous work, we are able to produce datasets containing a few orders of magnitude more optimal trajectories and obtain network performance compatible with real mission requirements. Several schemes able to train representations of either the optimal policy (thrust profile) or the value function (optimal mass) are proposed and tested. We find that both policy learning and value function learning successfully and accurately learn the optimal thrust, and that a spacecraft employing the learned thrust is able to reach the target conditions while spending only 2 per mille more propellant than in the corresponding mathematically optimal transfer. Moreover, in the case of value function learning, the optimal propellant mass can be predicted with an error well within 1%. All G&CNETs produced are tested in simulations of interplanetary transfers with respect to their ability to reach the target conditions optimally, starting from both nominal and off-nominal conditions.
