Title
Neural Network Approximation of Continuous Functions in High Dimensions with Applications to Inverse Problems
Authors
Abstract
The remarkable successes of neural networks across a huge variety of inverse problems have fueled their adoption in disciplines ranging from medical imaging to seismic analysis over the past decade. However, the high dimensionality of such inverse problems has simultaneously left current theory, which predicts that networks should scale exponentially in the dimension of the problem, unable to explain why the seemingly small networks used in these settings work as well as they do in practice. To reduce this gap between theory and practice, we provide a general method for bounding the complexity required for a neural network to approximate a Hölder (or uniformly) continuous function defined on a high-dimensional set with a low-complexity structure. The approach is based on the observation that the existence of a Johnson-Lindenstrauss embedding $A\in\mathbb{R}^{d\times D}$ of a given high-dimensional set $S\subset\mathbb{R}^D$ into a low-dimensional cube $[-M,M]^d$ implies that for any Hölder (or uniformly) continuous function $f:S\to\mathbb{R}^p$, there exists a Hölder (or uniformly) continuous function $g:[-M,M]^d\to\mathbb{R}^p$ such that $g(Ax)=f(x)$ for all $x\in S$. Hence, if one has a neural network that approximates $g:[-M,M]^d\to\mathbb{R}^p$, then a layer implementing the JL embedding $A$ can be prepended to obtain a neural network that approximates $f:S\to\mathbb{R}^p$. By pairing JL embedding results with results on the approximation of Hölder (or uniformly) continuous functions by neural networks, one then obtains bounds on the complexity required for a neural network to approximate Hölder (or uniformly) continuous functions on high-dimensional sets. The end result is a general theoretical framework that can be used to better explain the observed empirical successes of smaller networks in a wider variety of inverse problems than current theory allows.
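To make the construction concrete, the following is a minimal sketch (not the paper's implementation) of the architecture the abstract describes: a fixed, untrained linear layer implementing a random Gaussian projection standing in for the JL embedding $A\in\mathbb{R}^{d\times D}$, followed by a small ReLU MLP standing in for the network that approximates $g:[-M,M]^d\to\mathbb{R}^p$. The class name, layer sizes, and the choice of a Gaussian matrix are illustrative assumptions.

```python
# Hypothetical sketch: f(x) is approximated as g(Ax), where A is a fixed
# Gaussian JL-style projection and g is a trainable MLP on the low-dimensional cube.
import torch
import torch.nn as nn

class JLNet(nn.Module):
    def __init__(self, D=4096, d=64, p=10, hidden=256):
        super().__init__()
        # Fixed (untrained) embedding layer: x in R^D -> Ax in R^d.
        self.A = nn.Linear(D, d, bias=False)
        nn.init.normal_(self.A.weight, std=1.0 / d**0.5)
        self.A.weight.requires_grad_(False)
        # Trainable MLP intended to approximate g : [-M, M]^d -> R^p.
        self.g = nn.Sequential(
            nn.Linear(d, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, p),
        )

    def forward(self, x):
        # Mirrors the identity g(Ax) = f(x) for x in S.
        return self.g(self.A(x))

# Usage: map a batch of high-dimensional inputs to R^p.
net = JLNet()
y = net(torch.randn(8, 4096))  # shape: (8, 10)
```

The point of the sketch is that only the MLP acting on $\mathbb{R}^d$ is trained, so the network's size is governed by the embedding dimension $d$ rather than the ambient dimension $D$, which is the mechanism the framework uses to avoid complexity bounds that scale exponentially in $D$.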