Paper title
Collaborative Learning in Kernel-based Bandits for Distributed Users
Paper authors
Abstract
We study collaborative learning among distributed clients facilitated by a central server. Each client is interested in maximizing a personalized objective function that is a weighted sum of its local objective and a global objective. Each client has direct access to random bandit feedback on its local objective, but has only a partial view of the global objective and relies on information exchange with other clients for collaborative learning. We adopt the kernel-based bandit framework, in which the objective functions belong to a reproducing kernel Hilbert space. We propose an algorithm based on surrogate Gaussian process (GP) models and establish its order-optimal regret performance (up to polylogarithmic factors). We also show that sparse approximations of the GP models can be employed to reduce the communication overhead across clients.
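
To make the setting concrete, the following is a minimal sketch of two ingredients the abstract mentions: a personalized objective formed as a weighted sum of local and global values, and a GP surrogate fitted to noisy bandit feedback. It is not the algorithm proposed in the paper. The squared-exponential kernel, the weight name `alpha`, the noise level, and the upper-confidence-bound selection rule with parameter `beta` are all assumptions chosen only for illustration.

```python
import numpy as np

def rbf_kernel(a, b, lengthscale=0.2):
    """Squared-exponential kernel between 1-D point sets a and b (assumed kernel)."""
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / lengthscale) ** 2)

def gp_posterior(x_obs, y_obs, x_query, noise=1e-2):
    """Posterior mean and variance of a zero-mean GP surrogate at x_query."""
    K = rbf_kernel(x_obs, x_obs) + noise * np.eye(len(x_obs))
    k_star = rbf_kernel(x_obs, x_query)            # shape (n_obs, n_query)
    mean = k_star.T @ np.linalg.solve(K, y_obs)
    var = 1.0 - np.sum(k_star * np.linalg.solve(K, k_star), axis=0)
    return mean, np.maximum(var, 0.0)              # clip tiny negative values

def personalized_objective(local_vals, global_vals, alpha):
    """Weighted sum of local and global objective values, with alpha in [0, 1]."""
    return alpha * local_vals + (1.0 - alpha) * global_vals

def ucb_choice(x_obs, y_obs, candidates, beta=2.0):
    """Pick the candidate maximizing mean + beta * std (a generic UCB rule)."""
    mean, var = gp_posterior(x_obs, y_obs, candidates)
    return candidates[np.argmax(mean + beta * np.sqrt(var))]
```

In this sketch a client would call `gp_posterior` on its own noisy observations to estimate its local objective. How the estimate of the global objective is assembled from the other clients' information is exactly what the paper's algorithm specifies, and it is left out here.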