Collaborative Learning in Kernel-based Bandits for Distributed Users

Paper Title

Collaborative Learning in Kernel-based Bandits for Distributed Users

Authors

Sudeep Salgia, Sattar Vakili, Qing Zhao

Abstract

We study collaborative learning among distributed clients facilitated by a central server. Each client is interested in maximizing a personalized objective function that is a weighted sum of its local objective and a global objective. Each client has direct access to random bandit feedback on its local objective, but only has a partial view of the global objective and relies on information exchange with other clients for collaborative learning. We adopt the kernel-based bandit framework where the objective functions belong to a reproducing kernel Hilbert space. We propose an algorithm based on surrogate Gaussian process (GP) models and establish its order-optimal regret performance (up to polylogarithmic factors). We also show that the sparse approximations of the GP models can be employed to reduce the communication overhead across clients.
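
The personalized objective described in the abstract can be written out as a short formula. The notation below is assumed for illustration and is not taken from the paper: client i has local objective f_i, the global objective is g, the weight alpha_i lies in [0, 1], the domain is X, and H_k is the reproducing kernel Hilbert space of a kernel k. Writing the weighted sum as a convex combination is also an assumption, and the abstract does not say how the global objective is built from the local ones, so g is left abstract.

```latex
% Illustrative notation only (assumed, not the paper's own symbols).
% Personalized objective of client i: a weighted sum of local and global objectives.
\[
  h_i(x) \;=\; \alpha_i\, f_i(x) \;+\; (1 - \alpha_i)\, g(x),
  \qquad \alpha_i \in [0, 1].
\]
% Each client seeks a maximizer of its own personalized objective;
% the objectives are assumed to lie in the RKHS of the kernel k.
\[
  x_i^\star \in \arg\max_{x \in \mathcal{X}} h_i(x),
  \qquad f_i,\; g \in \mathcal{H}_k .
\]
```

Under this reading, alpha_i = 1 reduces to a purely local bandit problem, while alpha_i = 0 makes every client optimize the shared global objective.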
