Paper Title

Finding All ε-Good Arms in Stochastic Bandits

Authors

Blake Mason, Lalit Jain, Ardhendu Tripathy, Robert Nowak

Abstract

The pure-exploration problem in stochastic multi-armed bandits aims to find one or more arms with the largest (or near largest) means. Examples include finding an ε-good arm, best-arm identification, top-k arm identification, and finding all arms with means above a specified threshold. However, the problem of finding all ε-good arms has been overlooked in past work, although arguably this may be the most natural objective in many applications. For example, a virologist may conduct preliminary laboratory experiments on a large candidate set of treatments and move all ε-good treatments into more expensive clinical trials. Since the ultimate clinical efficacy is uncertain, it is important to identify all ε-good candidates. Mathematically, the all-ε-good arm identification problem presents significant new challenges and surprises that do not arise in the pure-exploration objectives studied in the past. We introduce two algorithms to overcome these and demonstrate their great empirical performance on a large-scale crowd-sourced dataset of 2.2M ratings collected by the New Yorker Caption Contest as well as a dataset testing hundreds of possible cancer drugs.
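To make the target set concrete, here is a minimal sketch of what "all ε-good arms" means under one common reading. It assumes the additive definition, in which an arm is ε-good when its mean is within ε of the largest mean; the abstract does not spell out the definition, so treat this as an assumption. The function name `epsilon_good_arms` and the example means are hypothetical, and the sketch takes the true means as given, whereas a bandit algorithm has to estimate them from noisy samples.

```python
from typing import List, Sequence


def epsilon_good_arms(means: Sequence[float], eps: float) -> List[int]:
    """Return the indices of all arms whose mean is within eps of the best mean.

    Assumption: additive definition, i.e. arm i is eps-good
    when means[i] >= max(means) - eps.
    """
    if eps < 0:
        raise ValueError("eps must be non-negative")
    if len(means) == 0:
        return []
    threshold = max(means) - eps
    return [i for i, mu in enumerate(means) if mu >= threshold]


# Hypothetical example: four treatments with known mean efficacy.
example_means = [0.9, 0.85, 0.5, 0.82]
good = epsilon_good_arms(example_means, eps=0.1)
print(good)
```

With ε = 0 the set contains only the arms tied for the best mean. Unlike top-k identification, the size of the returned set is not fixed in advance, and unlike fixed-threshold identification, the cutoff depends on the unknown best mean.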
