Title
Submodular Bandit Problem Under Multiple Constraints
Authors
Abstract
The linear submodular bandit problem was proposed to simultaneously address diversified retrieval and online learning in a recommender system. Without uncertainty, this problem is equivalent to submodular maximization under a cardinality constraint. In some situations, however, recommendation lists should satisfy additional constraints, such as budget constraints, beyond a cardinality constraint. Thus, motivated by diversified retrieval under budget constraints, we introduce a submodular bandit problem under the intersection of $l$ knapsack constraints and a $k$-system constraint. Here, $k$-system constraints form a very general class of constraints that includes cardinality constraints and intersections of $k$ matroid constraints. To solve this problem, we propose a non-greedy algorithm that adaptively focuses on a standard or a modified upper-confidence bound. We provide a high-probability upper bound on the approximation regret, where the approximation ratio matches that of a fast offline algorithm. Moreover, we perform experiments under various combinations of constraints using one synthetic and two real-world datasets, and demonstrate that our proposed methods outperform the existing baselines.
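To make the no-uncertainty special case mentioned above concrete, the following is a minimal sketch of greedy maximization of a monotone submodular function under a cardinality constraint, which achieves a $(1 - 1/e)$ approximation. The set-coverage objective used here is an illustrative assumption, not the paper's objective, and this sketch does not model the knapsack or $k$-system constraints or the bandit feedback.

```python
def coverage(selected, sets):
    """Submodular objective (illustrative): number of distinct
    elements covered by the chosen sets."""
    covered = set()
    for i in selected:
        covered |= sets[i]
    return len(covered)


def greedy_cardinality(sets, k):
    """Greedily pick up to k sets, each round adding the set with the
    largest marginal gain in coverage; for a monotone submodular
    objective this gives a (1 - 1/e) approximation."""
    selected = []
    for _ in range(k):
        base = coverage(selected, sets)
        best, best_gain = None, 0
        for i in range(len(sets)):
            if i in selected:
                continue
            gain = coverage(selected + [i], sets) - base
            if gain > best_gain:
                best, best_gain = i, gain
        if best is None:  # no item adds positive marginal value
            break
        selected.append(best)
    return selected
```

For example, with `sets = [{1, 2, 3}, {3, 4}, {4, 5, 6}, {1}]` and `k = 2`, the greedy rule first picks index 0 (gain 3), then index 2 (gain 3), covering all six elements.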