A reinforcement learning approach to rare trajectory sampling

Paper title

A reinforcement learning approach to rare trajectory sampling

Authors

Rose, Dominic C., Mair, Jamie F., Garrahan, Juan P.

Abstract

Very often when studying non-equilibrium systems one is interested in analysing dynamical behaviour that occurs with very low probability, so-called rare events. In practice, since rare events are by definition atypical, they are often difficult to access in a statistically significant way. What is required are strategies to "make rare events typical" so that they can be generated on demand. Here we present such a general approach to adaptively construct a dynamics that efficiently samples atypical events. We do so by exploiting the methods of reinforcement learning (RL), which refers to the set of machine learning techniques aimed at finding the optimal behaviour to maximise a reward associated with the dynamics. We consider the general perspective of dynamical trajectory ensembles, whereby rare events are described in terms of ensemble reweighting. By minimising the distance between a reweighted ensemble and that of a suitably parametrised controlled dynamics, we arrive at a set of methods similar to those of RL to numerically approximate the optimal dynamics that realises the rare behaviour of interest. As simple illustrations we consider in detail the problem of excursions of a random walker, for the case of rare events with a finite time horizon; and the problem of studying the current statistics of a particle hopping in a ring geometry, for the case of an infinite time horizon. We discuss natural extensions of the ideas presented here, including to continuous-time Markov systems, first passage time problems and non-Markovian dynamics.
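To make the idea of "making rare events typical" concrete, the following is a minimal sketch of ensemble reweighting for a toy problem. It is not the authors' algorithm: the controlled dynamics here is a random walk with a hand-picked, fixed up-step probability (`p_up`), whereas the abstract describes learning the controlled dynamics adaptively. The target event (a symmetric walk ending at or above a threshold `a` after `T` steps), the function names, and all parameter values are assumptions chosen for illustration only.

```python
import math
import random


def exact_tail(T, a):
    """Exact P(S_T >= a) for a symmetric +/-1 walk of T steps started at 0."""
    # S_T = 2k - T with k up-steps, so S_T >= a  <=>  k >= (T + a) / 2.
    k_min = math.ceil((T + a) / 2)
    return sum(math.comb(T, k) for k in range(k_min, T + 1)) / 2 ** T


def reweighted_estimate(T, a, p_up, n_samples, rng):
    """Estimate P(S_T >= a) under the symmetric walk by sampling a biased walk.

    Trajectories are drawn from a controlled dynamics with up-step
    probability p_up (an illustrative, hand-chosen control). Each trajectory
    is reweighted by the likelihood ratio of the original dynamics to the
    controlled one, so the estimate is unbiased for the original dynamics.
    """
    total = 0.0
    for _ in range(n_samples):
        position = 0
        log_weight = 0.0
        for _ in range(T):
            if rng.random() < p_up:
                position += 1
                log_weight += math.log(0.5 / p_up)
            else:
                position -= 1
                log_weight += math.log(0.5 / (1.0 - p_up))
        if position >= a:
            total += math.exp(log_weight)
    return total / n_samples


if __name__ == "__main__":
    T, a = 20, 14
    # Bias chosen so the mean final position of the controlled walk equals a.
    p_up = (T + a) / (2 * T)
    rng = random.Random(0)
    print("exact     :", exact_tail(T, a))
    print("reweighted:", reweighted_estimate(T, a, p_up, 20000, rng))
```

Under the biased walk the target event occurs in a large fraction of samples, while the likelihood-ratio weights correct the estimate back to the original symmetric dynamics. A learned control, as described in the abstract, would replace the fixed `p_up` with a parametrised dynamics tuned to bring the controlled ensemble close to the reweighted one.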
