Title
Notes on ridge functions and neural networks
Authors
Abstract
These notes are about ridge functions. Recent years have witnessed a flurry of interest in these functions. Ridge functions appear in various fields and under various guises. They appear in fields as diverse as partial differential equations (where they are called plane waves), computerized tomography, and statistics. These functions are also the underpinnings of many central models in neural networks. We are interested in ridge functions from the point of view of approximation theory. The basic goal in approximation theory is to approximate complicated objects by simpler objects. Among many classes of multivariate functions, linear combinations of ridge functions are a class of simpler functions. These notes study some problems of approximation of multivariate functions by linear combinations of ridge functions. We present here various properties of these functions. The questions we ask are as follows. When can a multivariate function be expressed as a linear combination of ridge functions from a certain class? When do such linear combinations represent every multivariate function? If a precise representation is not possible, can one approximate arbitrarily well? If arbitrarily good approximation fails, how can one compute or estimate the error of approximation, knowing that a best approximation exists? How can one characterize and construct best approximations? If a smooth function is a sum of arbitrarily behaved ridge functions, can it be expressed as a sum of smooth ridge functions? We also study properties of generalized ridge functions, which are closely related to linear superpositions and Kolmogorov's famous superposition theorem. These notes end with a few applications of ridge functions to the problem of approximation by single and two hidden layer neural networks with a restricted set of weights. We hope that these notes will be useful and interesting to both researchers and students.
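To make the central object concrete: a ridge function is a multivariate function of the form g(a·x), where a is a fixed direction vector and g is univariate. The sketch below (not taken from the notes; the helper `ridge` and the example are illustrative) shows a classical identity: the product xy is exactly a linear combination of two ridge functions, since xy = ((x+y)² − (x−y)²)/4.

```python
import numpy as np

def ridge(g, a):
    """Build the ridge function x -> g(a . x) for a univariate g
    and a fixed direction vector a (illustrative helper)."""
    a = np.asarray(a, dtype=float)
    return lambda x: g(np.asarray(x, dtype=float) @ a)

# xy = ((x+y)^2 - (x-y)^2)/4: a sum of two ridge functions with
# directions (1, 1) and (1, -1).
r1 = ridge(lambda t: t**2 / 4.0, (1.0, 1.0))
r2 = ridge(lambda t: -(t**2) / 4.0, (1.0, -1.0))
f = lambda x: r1(x) + r2(x)

print(f(np.array([3.0, 5.0])))  # 15.0, i.e. 3 * 5
```

Not every multivariate function admits such an exact finite representation, which is precisely why the representation and approximation questions listed in the abstract are nontrivial.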