Title
Long-term stability and generalization of observationally-constrained stochastic data-driven models for geophysical turbulence
Authors
Abstract
Recent years have seen a surge of interest in building deep learning-based, fully data-driven models for weather prediction. Such deep learning models, if trained on observations, can mitigate certain biases in current state-of-the-art weather models, some of which stem from inaccurate representation of subgrid-scale processes. However, these data-driven models, being over-parameterized, require large amounts of training data, which may not be available from reanalysis (observational data) products. Moreover, an accurate, noise-free initial condition from which to start forecasting with a data-driven weather model is not available in realistic scenarios. Finally, deterministic data-driven forecasting models suffer from long-term instability and unphysical climate drift, which makes them unsuitable for computing climate statistics. Given these challenges, previous studies have tried to pre-train deep learning-based weather forecasting models on a large amount of imperfect long-term climate model simulations and then re-train them on available observational data. In this paper, we propose a convolutional variational autoencoder-based stochastic data-driven model that is pre-trained on an imperfect climate model simulation of a 2-layer quasi-geostrophic flow and re-trained, using transfer learning, on a small number of noisy observations from a perfect simulation. This re-trained model then performs stochastic forecasting with a noisy initial condition sampled from the perfect simulation. We show that our ensemble-based stochastic data-driven model outperforms a baseline deterministic encoder-decoder-based convolutional model in terms of short-term skill while remaining stable in long-term climate simulations and yielding an accurate climatology.
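To make the ensemble-based stochastic forecasting idea concrete, the following is a minimal sketch and not the authors' implementation. The function `toy_stochastic_step` is a hypothetical stand-in for one step of the trained variational autoencoder: it uses a damped linear map plus Gaussian noise where a real model would sample a latent vector and decode it. The grid size, noise levels, and ensemble size are arbitrary assumptions, and starting every member from the same noisy initial condition is likewise an assumption, since the abstract does not say how the ensemble is generated.

```python
import numpy as np


def toy_stochastic_step(state, rng, noise_std=0.05):
    """Hypothetical stand-in for one step of a trained stochastic model.

    A damped linear map plus Gaussian noise replaces the sample-and-decode
    step of a real convolutional variational autoencoder.
    """
    return 0.95 * state + noise_std * rng.standard_normal(state.shape)


def ensemble_forecast(noisy_ic, n_members, n_steps, seed=0):
    """Roll out an ensemble in which every member starts from the same noisy state."""
    rng = np.random.default_rng(seed)
    # Assumption: members share the initial condition and diverge only
    # through the randomness inside the model step.
    members = np.repeat(noisy_ic[None, ...], n_members, axis=0)
    trajectory = np.empty((n_steps, n_members) + noisy_ic.shape)
    for t in range(n_steps):
        members = toy_stochastic_step(members, rng)
        trajectory[t] = members
    return trajectory


# Toy 2-layer field on an 8x8 grid with additive observation noise.
rng = np.random.default_rng(1)
truth = rng.standard_normal((2, 8, 8))
noisy_ic = truth + 0.1 * rng.standard_normal(truth.shape)

traj = ensemble_forecast(noisy_ic, n_members=20, n_steps=10)
ensemble_mean = traj.mean(axis=1)    # point forecast at each step
ensemble_spread = traj.std(axis=1)   # per-gridpoint forecast uncertainty
```

In this sketch the ensemble mean serves as the point forecast and the spread as a rough uncertainty estimate; a deterministic baseline would correspond to a single member with `noise_std=0`.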