Paper Title


Optimizing the optimizer for data driven deep neural networks and physics informed neural networks

Paper Authors

Taylor, John; Wang, Wenyi; Bala, Biswajit; Bednarz, Tomasz

Abstract


We investigate the role of the optimizer in determining the quality of the model fit for neural networks with a small to medium number of parameters. We study the performance of Adam, an algorithm for first-order gradient-based optimization that uses adaptive momentum; the Levenberg-Marquardt (LM) algorithm, a second-order method; the Broyden-Fletcher-Goldfarb-Shanno (BFGS) algorithm, another second-order method; and LBFGS, a low-memory version of BFGS. Using these optimizers we fit the function y = sinc(10x) using a neural network with a few parameters. This function has a variable amplitude and a constant frequency. We observe that the higher-amplitude components of the function are fitted first and that Adam, BFGS and LBFGS struggle to fit the lower-amplitude components. We also solve the Burgers equation using a physics-informed neural network (PINN) with the BFGS and LM optimizers. For our example problems with a small to medium number of weights, we find that the LM algorithm is able to rapidly converge to machine precision, offering significant benefits over the other optimizers. We further investigated the Adam optimizer with a range of models and found that Adam requires much deeper models with large numbers of hidden units, containing up to 26x more parameters, in order to achieve a model fit close to that achieved by the LM optimizer. The LM optimizer results illustrate that it may be possible to build models with far fewer parameters. We have implemented all our methods in Keras and TensorFlow 2.
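As a rough illustration of the function-fitting experiment, the sketch below fits y = sinc(10x) with a small fully connected network in Keras/TensorFlow 2 using the Adam optimizer. The layer widths, learning rate, epoch count, and the use of NumPy's normalized sinc are assumptions for illustration, not the paper's exact settings, and the LM, BFGS and LBFGS runs compared in the paper are not reproduced here.

```python
# Minimal sketch with illustrative settings (layer widths, learning rate,
# epochs); np.sinc is the normalized sinc, which may differ from the paper's
# exact definition of sinc(10x).
import numpy as np
import tensorflow as tf

# Training data: y = sinc(10x) sampled on [-1, 1].
x = np.linspace(-1.0, 1.0, 1000).reshape(-1, 1).astype("float32")
y = np.sinc(10.0 * x).astype("float32")

# A small fully connected network with a modest number of parameters.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(1,)),
    tf.keras.layers.Dense(20, activation="tanh"),
    tf.keras.layers.Dense(20, activation="tanh"),
    tf.keras.layers.Dense(1),
])

# Adam baseline only; the LM, BFGS and LBFGS comparisons are not shown here.
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=1e-3), loss="mse")
history = model.fit(x, y, epochs=1000, batch_size=100, verbose=0)
print("final MSE:", history.history["loss"][-1])
```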

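For the PINN experiment, the sketch below shows how a Burgers-equation residual u_t + u*u_x - nu*u_xx can be formed with nested gradient tapes in TensorFlow 2 and minimized with a single generic Adam step. The network size, collocation sampling, viscosity value nu = 0.01/pi, and the omission of initial/boundary-condition losses are assumptions for illustration; the paper's BFGS and LM training is not reproduced here.

```python
# Minimal PINN sketch for the Burgers equation, assuming the common benchmark
# viscosity nu = 0.01/pi and an illustrative network; initial and boundary
# condition losses are omitted for brevity.
import numpy as np
import tensorflow as tf

nu = tf.constant(0.01 / np.pi, dtype=tf.float32)  # assumed viscosity

# Small fully connected network approximating u(x, t).
model = tf.keras.Sequential([
    tf.keras.Input(shape=(2,)),
    tf.keras.layers.Dense(20, activation="tanh"),
    tf.keras.layers.Dense(20, activation="tanh"),
    tf.keras.layers.Dense(1),
])

def pde_residual(x, t):
    # Residual of u_t + u*u_x - nu*u_xx = 0 at collocation points (x, t),
    # using nested tapes for the second derivative.
    with tf.GradientTape(persistent=True) as tape2:
        tape2.watch([x, t])
        with tf.GradientTape(persistent=True) as tape1:
            tape1.watch([x, t])
            u = model(tf.concat([x, t], axis=1))
        u_x = tape1.gradient(u, x)
        u_t = tape1.gradient(u, t)
    u_xx = tape2.gradient(u_x, x)
    return u_t + u * u_x - nu * u_xx

# Random collocation points: x in [-1, 1], t in [0, 1].
x_c = tf.random.uniform((2000, 1), -1.0, 1.0)
t_c = tf.random.uniform((2000, 1), 0.0, 1.0)

# One generic Adam step on the residual loss; the paper's BFGS and LM
# training of the PINN is not reproduced here.
optimizer = tf.keras.optimizers.Adam(1e-3)
with tf.GradientTape() as tape:
    loss = tf.reduce_mean(tf.square(pde_residual(x_c, t_c)))
grads = tape.gradient(loss, model.trainable_variables)
optimizer.apply_gradients(zip(grads, model.trainable_variables))
print("PDE residual loss:", float(loss))
```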