Paper title
A general approximation lower bound in $L^p$ norm, with applications to feed-forward neural networks
Paper authors
Paper abstract
We study the fundamental limits to the expressive power of neural networks. Given two sets $F$, $G$ of real-valued functions, we first prove a general lower bound on how well functions in $F$ can be approximated in $L^p(\mu)$ norm by functions in $G$, for any $p \geq 1$ and any probability measure $\mu$. The lower bound depends on the packing number of $F$, the range of $F$, and the fat-shattering dimension of $G$. We then instantiate this bound in the case where $G$ corresponds to a piecewise-polynomial feed-forward neural network, and describe in detail the application to two sets $F$: Hölder balls and multivariate monotonic functions. Besides matching (known or new) upper bounds up to log factors, our lower bounds shed some light on the similarities and differences between approximation in $L^p$ norm and in sup norm, solving an open question raised by DeVore et al. (2021). Our proof strategy differs from the sup norm case and uses a key probability result of Mendelson (2002).
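As a rough illustration of the type of statement summarized above, a lower bound of this kind typically takes the following schematic form, where $\mathcal{M}(F,\varepsilon)$ denotes an $\varepsilon$-packing number of $F$, $\mathrm{fat}_{c\varepsilon}(G)$ the fat-shattering dimension of $G$ at scale $c\varepsilon$, and $R$ the range of the functions in $F$. This is only a sketch of the general shape; the constants, the exact complexity comparison, and the scale $c\varepsilon$ are placeholders rather than the paper's precise theorem:

% Schematic only: shape of a packing-number / fat-shattering lower bound,
% not the exact statement or constants proved in the paper.
\[
  \sup_{f \in F}\; \inf_{g \in G}\; \|f - g\|_{L^p(\mu)} \;\ge\; c\,\varepsilon
  \qquad \text{whenever} \qquad
  \log \mathcal{M}(F,\varepsilon) \;\gtrsim\;
  \mathrm{fat}_{c\varepsilon}(G)\,\log\!\Bigl(\frac{R}{\varepsilon}\Bigr),
\]
% with the abstract asserting validity for any $p \ge 1$ and any probability measure $\mu$.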