论文标题
在线学习通过最少的选择原则运输
Online Learning to Transport via the Minimal Selection Principle
论文作者
论文摘要
在运营研究中的强大动态资源分配中,我们研究了\ textit {在线学习到运输}(OLT)问题,而决策变量是概率措施,一个无限维度的对象。我们通过\ citet {ambrosio_2005}在Wasserstein梯度流中最初研究的洞察力,在线学习,最佳传输和部分微分方程之间进行连接。这使我们可以将标准的在线学习框架扩展到无限维度的无缝设置。基于我们的框架,我们得出了一种新的方法,称为\ textit {最小选择或探索(MSOE)算法},以使用均值场近似和离散化技术来解决OLT问题。在位移凸设置中,基于我们方法的主要理论信息是,随着时间的推移,最小化运输成本(通过最小选择原理)确保了最佳的累积遗憾上限。在算法方面,我们的MSOE算法适用于位移凸设置以外,使最佳传输的数学理论实际上与动态资源分配中常见的非convex设置相关。
Motivated by robust dynamic resource allocation in operations research, we study the \textit{Online Learning to Transport} (OLT) problem where the decision variable is a probability measure, an infinite-dimensional object. We draw connections between online learning, optimal transport, and partial differential equations through an insight called the minimal selection principle, originally studied in the Wasserstein gradient flow setting by \citet{Ambrosio_2005}. This allows us to extend the standard online learning framework to the infinite-dimensional setting seamlessly. Based on our framework, we derive a novel method called the \textit{minimal selection or exploration (MSoE) algorithm} to solve OLT problems using mean-field approximation and discretization techniques. In the displacement convex setting, the main theoretical message underpinning our approach is that minimizing transport cost over time (via the minimal selection principle) ensures optimal cumulative regret upper bounds. On the algorithmic side, our MSoE algorithm applies beyond the displacement convex setting, making the mathematical theory of optimal transport practically relevant to non-convex settings common in dynamic resource allocation.