Paper Title

On Explaining the Surprising Success of Reservoir Computing Forecaster of Chaos? The Universal Machine Learning Dynamical System with Contrasts to VAR and DMD

Paper Authors

Bollt, Erik

Paper Abstract

Machine learning has become a widely popular and successful paradigm, including in data-driven science and engineering. A major application problem is data-driven forecasting of future states from a complex dynamical system. Artificial neural networks (ANN) have evolved as a clear leader among many machine learning approaches, and recurrent neural networks (RNN) are considered especially well suited for forecasting dynamical systems. In this setting, the echo state network (ESN), or reservoir computer (RC), has emerged for its simplicity and computational complexity advantages. Instead of a fully trained network, an RC trains only the read-out weights by a simple, efficient least squares method. What is perhaps quite surprising is that an RC nonetheless succeeds in making high-quality forecasts, competitive with more intensively trained methods, even if not the leader. There remains an unanswered question as to why and how an RC works at all, despite its randomly selected weights. We explicitly connect the RC with linear activation and linear read-out to the well-developed time-series literature on vector autoregression (VAR), which includes representability theorems via the Wold theorem, and which already performs reasonably well for short-term forecasts. In the case of a linear activation and the now popular quadratic read-out, we explicitly connect the RC to a nonlinear VAR (NVAR), which performs quite well. Further, we associate this paradigm with the now widely popular dynamic mode decomposition (DMD), so that these three are in a sense different faces of the same thing. We illustrate our observations with popular benchmark examples, including the Mackey-Glass differential delay equation and the Lorenz63 system.
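
To make the VAR connection in the abstract concrete, here is a brief worked sketch in generic notation of our own choosing (A for the fixed random reservoir matrix, B for the random input weights, W_out for the trained read-out weights; the paper's own symbols may differ). With linear activation the reservoir update is itself linear, so unrolling it expresses the forecast as a VAR of infinite order whose lag coefficients are constrained by the random weights:

\[
r_{t} = A r_{t-1} + B x_{t-1} = \sum_{j=1}^{\infty} A^{j-1} B\, x_{t-j} \qquad (\rho(A) < 1),
\]
\[
\hat{x}_{t} = W_{\mathrm{out}}\, r_{t} = \sum_{j=1}^{\infty} \big( W_{\mathrm{out}} A^{j-1} B \big)\, x_{t-j}.
\]

In other words, the linear-activation RC is a VAR whose coefficient matrices W_out A^{j-1} B are not free parameters but share the random factors A and B; only W_out is fit by least squares. A quadratic read-out analogously yields an NVAR containing products of lagged states.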

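Below is a minimal runnable Python sketch of the pipeline the abstract describes: fixed random reservoir weights, a linear-activation update, and a read-out trained by regularized least squares on Lorenz63 data. All parameter values, the ridge regularization, and the Euler integrator are illustrative assumptions rather than the paper's actual setup; replacing the identity activation with np.tanh gives the usual ESN.

import numpy as np

rng = np.random.default_rng(0)

# --- Generate Lorenz63 training data by simple Euler integration ---
def lorenz63(x, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    return np.array([
        sigma * (x[1] - x[0]),
        x[0] * (rho - x[2]) - x[1],
        x[0] * x[1] - beta * x[2],
    ])

dt, n_steps = 0.01, 5000
X = np.empty((n_steps, 3))
X[0] = [1.0, 1.0, 1.0]
for t in range(n_steps - 1):
    X[t + 1] = X[t] + dt * lorenz63(X[t])

# --- Reservoir with fixed random weights; only the read-out is trained ---
n_res = 300
W_in = 0.1 * rng.standard_normal((n_res, 3))     # random input weights (never trained)
W = rng.standard_normal((n_res, n_res))
W *= 0.9 / np.max(np.abs(np.linalg.eigvals(W)))  # rescale spectral radius to 0.9 < 1

r = np.zeros(n_res)
states = np.empty((n_steps - 1, n_res))
for t in range(n_steps - 1):
    # Linear activation: the case the paper connects to VAR.
    # Wrapping the right-hand side in np.tanh would give a standard ESN.
    r = W @ r + W_in @ X[t]
    states[t] = r

# --- Train the linear read-out r_t -> x_{t+1} by ridge-regularized least squares ---
ridge = 1e-4
Y = X[1:]                                        # one-step-ahead targets
W_out = np.linalg.solve(states.T @ states + ridge * np.eye(n_res),
                        states.T @ Y).T

print("one-step training RMSE:",
      np.sqrt(np.mean((states @ W_out.T - Y) ** 2)))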