Closed-loop deep learning: generating forward models with back-propagation

Paper title

Closed-loop deep learning: generating forward models with back-propagation

Authors

Sama Daryanavard, Bernd Porr

Abstract

A reflex is a simple closed-loop control approach that tries to minimise an error but fails to do so because it always reacts too late. An adaptive algorithm can use this error to learn a forward model with the help of predictive cues. For example, a driver learns to improve their steering by looking ahead, which avoids steering at the last minute. To process complex cues such as the road ahead, deep learning is a natural choice. However, this is usually achieved only indirectly, by employing deep reinforcement learning with a discrete state space. Here, we show how this can be achieved directly by embedding deep learning into a closed-loop system and preserving its continuous processing. Specifically, we show how error back-propagation can be achieved in z-space and, more generally, how gradient-based approaches can be analysed in such closed-loop scenarios. The performance of this learning paradigm is demonstrated using a line follower, both in simulation and on a real robot, both of which show very fast and continuous learning.
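
The idea in the abstract, that a late-acting reflex error can train an anticipatory response to a predictive cue, can be illustrated with a toy sketch. This is not the authors' network, robot, or z-space analysis. It assumes a scalar plant, a single learned weight instead of a deep network, and a sinusoidal track, and every name in it (`simulate`, `reflex_gain`, `cue`) is hypothetical.

```python
import math


def simulate(steps=2000, eta=0.01, reflex_gain=0.5, period=50):
    """Toy closed loop: the reflex error trains one weight on a predictive cue.

    Assumptions (not from the paper): scalar offset, one weight, sinusoidal bends.
    """
    w = 0.0      # learned weight on the predictive cue (starts untrained)
    x = 0.0      # lateral offset from the line, i.e. the reflex error
    errors = []
    for t in range(steps):
        cue = math.sin(2 * math.pi * t / period)  # upcoming bend, seen ahead
        reflex = reflex_gain * x                  # reacts only to past error
        predictive = w * cue                      # anticipatory steering
        # The bend pushes the robot off the line; both steering terms correct it.
        x = x + cue - reflex - predictive
        # One-step gradient descent on 0.5 * x**2 with respect to w,
        # using dx/dw = -cue and ignoring the dependence through earlier offsets.
        w += eta * x * cue
        errors.append(abs(x))
    return w, errors


if __name__ == "__main__":
    w, errors = simulate()
    early = sum(errors[:100]) / 100
    late = sum(errors[-100:]) / 100
    print(f"weight={w:.3f} early_error={early:.4f} late_error={late:.4f}")
```

In this sketch the reflex alone always leaves a residual offset, because it acts only after the offset has appeared. As the weight on the cue grows, the anticipatory term cancels the bend before an error forms, so the reflex error that drives learning shrinks and the weight settles.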

Paper title