论文标题
互动式建议,以限制市场的最佳分配
Interactive Recommendations for Optimal Allocations in Markets with Constraints
论文作者
论文摘要
推荐系统在市场中使用时发挥了双重作用:它们可以帮助用户从大型游泳池中选择最需要的物品,并有助于将有限数量的物品分配给最想要它们的用户。尽管在许多现实世界中的推荐设置中,能力限制的流行率普遍存在,但缺乏将它们纳入这些系统设计的原则性方式。在此激励的情况下,我们提出了一个交互式框架,系统提供商可以通过机会主义地探索分配来提高向用户的推荐质量,从而最大程度地利用用户奖励并使用适当的定价机制尊重容量约束。我们将问题建模为低排名组合的多臂强盗问题,并在手臂上的选择约束。我们采用一种集成方法,使用协作过滤,组合匪徒和最佳资源分配的技术,以提供一种算法,可证明可以实现次线性遗憾,即$ \ tilde {\ tilde {\ nathcal {o}}}}(\ sqrt {\ sqrt {n m(n m(n+m)$ n m(n m) $ r $平均奖励矩阵。关于合成和现实世界数据的实证研究也证明了我们方法的有效性和性能。
Recommendation systems when employed in markets play a dual role: they assist users in selecting their most desired items from a large pool and they help in allocating a limited number of items to the users who desire them the most. Despite the prevalence of capacity constraints on allocations in many real-world recommendation settings, a principled way of incorporating them in the design of these systems has been lacking. Motivated by this, we propose an interactive framework where the system provider can enhance the quality of recommendations to the users by opportunistically exploring allocations that maximize user rewards and respect the capacity constraints using appropriate pricing mechanisms. We model the problem as an instance of a low-rank combinatorial multi-armed bandit problem with selection constraints on the arms. We employ an integrated approach using techniques from collaborative filtering, combinatorial bandits, and optimal resource allocation to provide an algorithm that provably achieves sub-linear regret, namely $\tilde{\mathcal{O}} ( \sqrt{N M (N+M) RT} )$ in $T$ rounds for a problem with $N$ users, $M$ items and rank $R$ mean reward matrix. Empirical studies on synthetic and real-world data also demonstrate the effectiveness and performance of our approach.