Paper Title

Imbedding Deep Neural Networks

Paper Authors

Andrew Corbett, Dmitry Kangin

Paper Abstract

Continuous-depth neural networks, such as Neural ODEs, have refashioned the understanding of residual neural networks in terms of non-linear vector-valued optimal control problems. The common solution is to use the adjoint sensitivity method to replicate a forward-backward pass optimisation problem. We propose a new approach which explicates the network's `depth' as a fundamental variable, thus reducing the problem to a system of forward-facing initial value problems. This new method is based on the principle of `Invariant Imbedding' for which we prove a general solution, applicable to all non-linear, vector-valued optimal control problems with both running and terminal loss. Our new architectures provide a tangible tool for inspecting the theoretical--and to a great extent unexplained--properties of network depth. They also constitute a resource of discrete implementations of Neural ODEs comparable to classes of imbedded residual neural networks. Through a series of experiments, we show the competitive performance of the proposed architectures for supervised learning and time series prediction.
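The abstract frames a residual network as the Euler discretisation of a Neural ODE, with depth made explicit as a variable. Below is a minimal NumPy sketch of that correspondence; the tanh dynamics, the dimensions, and the fixed-step Euler scheme are illustrative assumptions, not the paper's actual Invariant Imbedding architecture.

```python
import numpy as np

def residual_step(h, W, b, dt):
    # One Euler step of dh/dt = f(h) = tanh(W h + b):
    # the residual-network view of a Neural ODE layer, h_{k+1} = h_k + dt * f(h_k).
    return h + dt * np.tanh(W @ h + b)

def integrate(h0, W, b, depth=100, t0=0.0, t1=1.0):
    # Integrate the hidden state from t0 to t1 in `depth` Euler steps.
    # Here `depth` plays the role of network depth, treated as an explicit variable.
    dt = (t1 - t0) / depth
    h = h0
    for _ in range(depth):
        h = residual_step(h, W, b, dt)
    return h

rng = np.random.default_rng(0)
W = 0.1 * rng.standard_normal((4, 4))  # small weights keep the dynamics stable
b = np.zeros(4)
h0 = rng.standard_normal(4)
h1 = integrate(h0, W, b, depth=100)
```

Refining the discretisation (increasing `depth`) should leave the terminal state essentially unchanged, which is the sense in which the residual network approximates the continuous-depth model.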
