Title
Submodular Bandit Problem Under Multiple Constraints
Authors
Abstract
The linear submodular bandit problem was proposed to simultaneously address diversified retrieval and online learning in a recommender system. Without uncertainty, this problem is equivalent to submodular maximization under a cardinality constraint. In some situations, however, recommendation lists should satisfy additional constraints, such as budget constraints, beyond a cardinality constraint. Thus, motivated by diversified retrieval under budget constraints, we introduce a submodular bandit problem under the intersection of $l$ knapsack constraints and a $k$-system constraint. Here, $k$-system constraints form a very general class of constraints that includes cardinality constraints and intersections of $k$ matroid constraints. To solve this problem, we propose a non-greedy algorithm that adaptively focuses on a standard or a modified upper-confidence bound. We provide a high-probability upper bound on the approximation regret, where the approximation ratio matches that of a fast offline algorithm. Moreover, we perform experiments under various combinations of constraints using one synthetic and two real-world datasets, and demonstrate that our proposed methods outperform the existing baselines.
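To make the no-uncertainty special case mentioned above concrete, the following is a minimal sketch of greedy maximization of a monotone submodular function under a cardinality constraint, which achieves a $(1 - 1/e)$ approximation. The set-coverage objective used here is an illustrative assumption, not the paper's objective, and this sketch does not model the knapsack or $k$-system constraints or the bandit feedback.

```python
def coverage(selected, sets):
    """Submodular objective (illustrative): number of distinct
    elements covered by the chosen sets."""
    covered = set()
    for i in selected:
        covered |= sets[i]
    return len(covered)


def greedy_cardinality(sets, k):
    """Greedily pick up to k sets, each round adding the set with the
    largest marginal gain in coverage; for a monotone submodular
    objective this gives a (1 - 1/e) approximation."""
    selected = []
    for _ in range(k):
        base = coverage(selected, sets)
        best, best_gain = None, 0
        for i in range(len(sets)):
            if i in selected:
                continue
            gain = coverage(selected + [i], sets) - base
            if gain > best_gain:
                best, best_gain = i, gain
        if best is None:  # no item adds positive marginal value
            break
        selected.append(best)
    return selected
```

For example, with `sets = [{1, 2, 3}, {3, 4}, {4, 5, 6}, {1}]` and `k = 2`, the greedy rule first picks index 0 (gain 3), then index 2 (gain 3), covering all six elements.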