论文标题
不同措施之间的经验最佳运输适应较低的复杂性
Empirical Optimal Transport between Different Measures Adapts to Lower Complexity
论文作者
论文摘要
从随机数据的两个概率度量之间的经验最佳运输(OT)成本是基于运输数据分析的基本数量。在这项工作中,当涉及的度量不同时,我们为其收敛率提供了新颖的保证,可能在不同的空间上得到支持。我们的主要观察结果是,经验成本的统计性能取决于较不复杂的度量,这是一种现象,我们称为经验ot的较低复杂性适应性。例如,在Lipschitz的地面成本下,我们发现基于$ n $观测值的经验成本至少与$ n^{ - 1/d} $的费率收敛于人口数量,如果两种措施之一集中在$ d $ dublemensional歧管上,而另一种措施则可以是任意的。对于半圆形地面成本,我们表明,费率上的上限提高到$ n^{ - 2/d} $。同样,我们的理论确定了半散旧OT的一般收敛速率$ n^{ - 1/2} $。所有这些结果在两个样本的情况下也是有效的,这意味着收敛速率仍然受两种措施的简单性。因此,从概念上讲,我们的发现表明,维度的诅咒仅在两种措施表现出高内在维度时才会影响OT成本的估计。我们的证明是基于OT的双重配方作为对合适的功能类别$ \ MATHCAL {F} _C $的最大化,并观察到,$ \ Mathcal {f} _C $的$ C $ - 转换为有限成本(限制性成本)具有与$ \ Mathcal {f} _c $相同的均匀度量输入。
The empirical optimal transport (OT) cost between two probability measures from random data is a fundamental quantity in transport based data analysis. In this work, we derive novel guarantees for its convergence rate when the involved measures are different, possibly supported on different spaces. Our central observation is that the statistical performance of the empirical OT cost is determined by the less complex measure, a phenomenon we refer to as lower complexity adaptation of empirical OT. For instance, under Lipschitz ground costs, we find that the empirical OT cost based on $n$ observations converges at least with rate $n^{-1/d}$ to the population quantity if one of the two measures is concentrated on a $d$-dimensional manifold, while the other can be arbitrary. For semi-concave ground costs, we show that the upper bound for the rate improves to $n^{-2/d}$. Similarly, our theory establishes the general convergence rate $n^{-1/2}$ for semi-discrete OT. All of these results are valid in the two-sample case as well, meaning that the convergence rate is still governed by the simpler of the two measures. On a conceptual level, our findings therefore suggest that the curse of dimensionality only affects the estimation of the OT cost when both measures exhibit a high intrinsic dimension. Our proofs are based on the dual formulation of OT as a maximization over a suitable function class $\mathcal{F}_c$ and the observation that the $c$-transform of $\mathcal{F}_c$ under bounded costs has the same uniform metric entropy as $\mathcal{F}_c$ itself.