Title
Eigenvalue initialisation and regularisation for Koopman autoencoders
Authors
Abstract
Regularising the parameter matrices of neural networks is ubiquitous in training deep models. Typical regularisation approaches suggest initialising weights with small random values and penalising weights to promote sparsity. However, these widely used techniques may be less effective in certain scenarios. Here, we study the Koopman autoencoder model, which comprises an encoder, a Koopman operator layer, and a decoder. These models are designed to tackle physics-related problems, offering interpretable dynamics and the ability to incorporate physics-related constraints. However, the majority of existing work employs standard regularisation practices. In our work, we take a step toward augmenting Koopman autoencoders with initialisation and penalty schemes tailored to physics-related settings. Specifically, we propose the "eigeninit" initialisation scheme, which samples initial Koopman operators from specific eigenvalue distributions. In addition, we suggest the "eigenloss" penalty scheme, which penalises the eigenvalues of the Koopman operator during training. We demonstrate the utility of these schemes on two synthetic data sets, a driven pendulum and flow past a cylinder, and two real-world problems, ocean surface temperatures and cyclone wind fields. We find on these data sets that eigenloss and eigeninit improve the convergence rate by up to a factor of 5, and that they reduce the cumulative long-term prediction error by up to a factor of 3. Such findings point to the utility of incorporating similar schemes as an inductive bias in other physics-related deep learning approaches.
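As a rough illustration of the two schemes described above, the sketch below (in NumPy) builds an initial Koopman operator with a prescribed eigenvalue distribution and computes an eigenvalue penalty. The particular choices here, eigenvalues sampled uniformly from a disc of a given radius and a penalty on eigenvalue moduli exceeding 1, are assumptions for illustration; the paper's actual distributions and penalty functions may differ.

```python
import numpy as np

def eigeninit(n, rng, radius=1.0):
    """Sample a real n x n operator whose eigenvalues are drawn
    uniformly from a disc of the given radius (an assumed
    distribution, for illustration). n must be even, since
    eigenvalues are drawn as complex-conjugate pairs."""
    assert n % 2 == 0
    # Uniform sampling on a disc: sqrt-transform the radial draw.
    r = radius * np.sqrt(rng.uniform(size=n // 2))
    theta = rng.uniform(0.0, np.pi, size=n // 2)
    # Assemble a block-diagonal matrix of 2x2 rotation-scaling
    # blocks; each block [[a, -b], [b, a]] has eigenvalues a +/- ib.
    D = np.zeros((n, n))
    for i, (ri, ti) in enumerate(zip(r, theta)):
        a, b = ri * np.cos(ti), ri * np.sin(ti)
        D[2 * i:2 * i + 2, 2 * i:2 * i + 2] = [[a, -b], [b, a]]
    # Conjugate by a random orthogonal matrix: mixes coordinates
    # without changing the spectrum.
    Q, _ = np.linalg.qr(rng.standard_normal((n, n)))
    return Q @ D @ Q.T

def eigenloss(K, weight=1.0):
    """Penalise eigenvalues of K whose modulus exceeds 1, i.e.
    discourage unstable dynamics (one plausible penalty choice)."""
    mods = np.abs(np.linalg.eigvals(K))
    return weight * np.sum(np.maximum(mods - 1.0, 0.0) ** 2)

rng = np.random.default_rng(0)
K = eigeninit(4, rng, radius=0.9)
spectral_radius = np.max(np.abs(np.linalg.eigvals(K)))
penalty = eigenloss(K)  # 0 here, since all eigenvalues lie inside the unit disc
```

In a full Koopman autoencoder, `eigeninit` would supply the initial weight of the operator layer, and `eigenloss` would be added to the reconstruction and prediction losses at each training step.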