Paper Title

Distributed Learning via Filtered Hyperinterpolation on Manifolds

Authors

Guido Montúfar, Yu Guang Wang

Abstract

Learning mappings of data on manifolds is an important topic in contemporary machine learning, with applications in astrophysics, geophysics, statistical physics, medical diagnosis, biochemistry, and 3D object analysis. This paper studies the problem of learning real-valued functions on manifolds through filtered hyperinterpolation of input-output data pairs, where the inputs may be sampled deterministically or at random and the outputs may be clean or noisy. Motivated by the problem of handling large data sets, it presents a parallel data processing approach which distributes the data-fitting task among multiple servers and synthesizes the fitted sub-models into a global estimator. We prove quantitative relations between the approximation quality of the learned function over the entire manifold, the type of target function, the number of servers, and the number and type of available samples. We obtain the approximation rates of convergence for distributed and non-distributed approaches. For the non-distributed case, the approximation order is optimal.
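
The split-and-average scheme described in the abstract (partition the data across servers, fit a sub-model on each shard, then synthesize the sub-models into a global estimator) can be illustrated with a minimal sketch. This is an illustrative assumption only: it substitutes a plain Gaussian kernel smoother for the paper's filtered hyperinterpolation, and the names `fit_local` and `distributed_estimator` are hypothetical, not from the paper.

```python
# Toy sketch of distributed fitting on a manifold: each "server" fits a local
# estimator on its data shard, and the global estimator averages their
# predictions. The kernel smoother below is a stand-in, NOT the paper's
# filtered hyperinterpolation.
import numpy as np

def fit_local(X, y, bandwidth=0.3):
    """Fit a simple Gaussian kernel smoother on one server's data shard."""
    def predict(x_query):
        d2 = np.sum((X - x_query) ** 2, axis=1)          # squared distances
        w = np.exp(-d2 / (2.0 * bandwidth ** 2))          # Gaussian weights
        return np.sum(w * y) / (np.sum(w) + 1e-12)        # weighted average
    return predict

def distributed_estimator(X, y, num_servers=4, bandwidth=0.3):
    """Split the samples across servers, fit sub-models, average predictions."""
    shards = np.array_split(np.random.permutation(len(X)), num_servers)
    sub_models = [fit_local(X[idx], y[idx], bandwidth) for idx in shards]
    return lambda x: np.mean([g(x) for g in sub_models])

# Toy usage: noisy samples of a function on the unit sphere S^2.
rng = np.random.default_rng(0)
X = rng.normal(size=(2000, 3))
X /= np.linalg.norm(X, axis=1, keepdims=True)             # project onto sphere
y = X[:, 0] * X[:, 1] + 0.05 * rng.normal(size=len(X))    # signal + noise

f_hat = distributed_estimator(X, y, num_servers=8)
x0 = np.array([1.0, 0.0, 0.0])
print(f_hat(x0))   # approximate value at x0 (true value is 0)
```

In the paper's setting, each local fit would instead be a filtered hyperinterpolation built from the shard's quadrature points and function values, and the analysis quantifies how the approximation error depends on the number of servers and the number and type of samples.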
