动态资源分配：算法设计原理和可实现的性能的光谱

论文标题

动态资源分配：算法设计原理和可实现的性能的光谱

Dynamic Resource Allocation: Algorithmic Design Principles and Spectrum of Achievable Performances

论文作者

Besbes, Omar, Kanoria, Yash, Kumar, Akshit

论文摘要

动态资源分配问题无处不在，在库存管理，订单履行，在线广告和其他应用程序中产生。我们最初专注于最简单的在线资源分配模型之一：多秘书问题。在多秘书问题中，决策者依次雇用$ t $候选者的$ b $，并绘制候选能力值。从$ [0,1] $上的分销$ f $。首先，我们研究了性能的基本限制，这是所考虑的价值分布的函数。我们以遗憾来量化性能，定义为相对于事后获得的最佳性能的添加剂损失。我们提出了$ω（T^{t^{1/2-1/2（1 +β）}）$的新型基本遗憾下限缩放，用于在其支持方面具有间隙的分布，$β$量化了这些间隙周围类型（值）的质量积累（值）。在对价值分布的特定假设下，这种下限与恒定和对数遗憾的保证在先前的工作中可以实现的对比。其次，我们介绍了一种新颖的算法原理，关于差距（CWG）的保守性，该原则近乎最佳的性能，遗憾地缩放了$ \ tilde {o}（t^{1/2-1/2-1/2-1/2-1/2（1 +β）}）$，用于通过质量积累参数分配的任何分布的分布，该类别由质量累积的累积参数$β$β$β$β$。然后，我们转向在动态资源分配问题上运行CWG原理。我们研究了一种通用和实用算法，使用多个模拟（RAM）反复起作用，该算法模拟了可能的未来，以估计基于后视的价值到GO函数的近似值。我们确定该算法继承了根据资源请求分布（包括我们的基于CWG的算法）量身定制的算法的理论性能保证，并发现它在数值实验中表现优于它们。

Dynamic resource allocation problems are ubiquitous, arising in inventory management, order fulfillment, online advertising, and other applications. We initially focus on one of the simplest models of online resource allocation: the multisecretary problem. In the multisecretary problem, a decision maker sequentially hires up to $B$ out of $T$ candidates, and candidate ability values are drawn i.i.d. from a distribution $F$ on $[0,1]$. First, we investigate fundamental limits on performance as a function of the value distribution under consideration. We quantify performance in terms of regret, defined as the additive loss relative to the best performance achievable in hindsight. We present a novel fundamental regret lower bound scaling of $Ω(T^{1/2 - 1/2(1 + β)})$ for distributions with gaps in their support, with $β$ quantifying the mass accumulation of types (values) around these gaps. This lower bound contrasts with the constant and logarithmic regret guarantees shown to be achievable in prior work, under specific assumptions on the value distribution. Second, we introduce a novel algorithmic principle, Conservativeness with respect to Gaps (CwG), which yields near-optimal performance with regret scaling of $\tilde{O}(T^{1/2 - 1/2(1 + β)})$ for any distribution in a class parameterized by the mass accumulation parameter $β$. We then turn to operationalizing the CwG principle across dynamic resource allocation problems. We study a general and practical algorithm, Repeatedly Act using Multiple Simulations (RAMS), which simulates possible futures to estimate a hindsight-based approximation of the value-to-go function. We establish that this algorithm inherits theoretical performance guarantees of algorithms tailored to the distribution of resource requests, including our CwG-based algorithm, and find that it outperforms them in numerical experiments.

下载PDF全文

下载文献需遵守相关版权规定

论文标题