Paper Title

Approximation Power of Deep Neural Networks: an explanatory mathematical survey

Authors

Owen Davis and Mohammad Motamed

Abstract

This survey provides an in-depth and explanatory review of the approximation properties of deep neural networks, with a focus on feed-forward and residual architectures. The primary objective is to examine how effectively neural networks approximate target functions and to identify conditions under which they outperform traditional approximation methods. Key topics include the nonlinear, compositional structure of deep networks and the formalization of neural network tasks as optimization problems in regression and classification settings. The survey also addresses the training process, emphasizing the role of stochastic gradient descent and backpropagation in solving these optimization problems, and highlights practical considerations such as activation functions, overfitting, and regularization techniques. Additionally, the survey explores the density of neural networks in the space of continuous functions, comparing the approximation capabilities of deep ReLU networks with those of other approximation methods. It discusses recent theoretical advancements in understanding the expressiveness and limitations of these networks. A detailed error-complexity analysis is also presented, focusing on error rates and computational complexity for neural networks with ReLU and Fourier-type activation functions in the context of bounded target functions with minimal regularity assumptions. Alongside recent known results, the survey introduces new findings, offering a valuable resource for understanding the theoretical foundations of neural network approximation. Concluding remarks and further reading suggestions are provided.
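The abstract's reference to the "nonlinear, compositional structure of deep networks" can be illustrated with a minimal sketch. The following toy example (not from the survey; the network weights below are hypothetical choices for illustration) builds a feed-forward ReLU network as a composition of affine maps and nonlinearities, and uses the identity relu(t) + relu(-t) = |t| to represent the absolute-value function exactly with one hidden layer:

```python
import numpy as np

def relu(x):
    # ReLU activation: componentwise max(0, x)
    return np.maximum(0.0, x)

def feed_forward(x, weights, biases):
    # Compositional structure of a feed-forward ReLU network:
    # f(x) = W_L · relu( ... relu(W_1 x + b_1) ... ) + b_L
    for W, b in zip(weights[:-1], biases[:-1]):
        x = relu(W @ x + b)
    return weights[-1] @ x + biases[-1]

# Hypothetical 1-hidden-layer network realizing f(t) = |t|,
# since relu(t) + relu(-t) = |t| for all t.
W1 = np.array([[1.0], [-1.0]]); b1 = np.zeros(2)
W2 = np.array([[1.0, 1.0]]);    b2 = np.zeros(1)

for t in (-0.5, 0.25, 1.0):
    y = feed_forward(np.array([t]), [W1, W2], [b1, b2])
    assert abs(y[0] - abs(t)) < 1e-12
```

Deeper networks are obtained by composing more such layers; the survey's error-complexity analysis asks how the width and depth of this composition must grow to approximate a given target function to a prescribed accuracy.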
