光谱匪徒的最佳手臂识别

论文标题

光谱匪徒的最佳手臂识别

Best Arm Identification in Spectral Bandits

论文作者

Kocák, Tomáš, Garivier, Aurélien

论文摘要

我们在具有图平滑度约束的强盗模型中以固定的置信度研究了最佳臂识别。我们提供并分析有效的梯度上升算法，以计算该问题的样本复杂性，以解决非平滑最大最大最大问题问题的解决方案（提供对无约束情况的简化分析）。在此算法的基础上，我们提出了一个渐近最佳的策略。此外，我们通过数值实验来说明策略的效率和平滑度约束对样品复杂性的影响。在从参数调整到临床试验的许多应用中，最佳手臂识别（BAI）是重要的挑战。现在，它在香草匪徒模型中已被很好地理解，但是现实世界中的问题通常涉及需要更多涉及模型的武器之间的一些依赖。假设手臂上的图形结构是一种涵盖这一现象的优雅实用方法，但是到目前为止，这仅仅是为了遗憾的最小化。用图形约束来解决BAI涉及本文提供解决方案的微妙优化问题。

We study best-arm identification with fixed confidence in bandit models with graph smoothness constraint. We provide and analyze an efficient gradient ascent algorithm to compute the sample complexity of this problem as a solution of a non-smooth max-min problem (providing in passing a simplified analysis for the unconstrained case). Building on this algorithm, we propose an asymptotically optimal strategy. We furthermore illustrate by numerical experiments both the strategy's efficiency and the impact of the smoothness constraint on the sample complexity. Best Arm Identification (BAI) is an important challenge in many applications ranging from parameter tuning to clinical trials. It is now very well understood in vanilla bandit models, but real-world problems typically involve some dependency between arms that requires more involved models. Assuming a graph structure on the arms is an elegant practical way to encompass this phenomenon, but this had been done so far only for regret minimization. Addressing BAI with graph constraints involves delicate optimization problems for which the present paper offers a solution.

下载PDF全文

下载文献需遵守相关版权规定

论文标题