Paper Title

FeDXL: Provable Federated Learning for Deep X-Risk Optimization

Authors

Zhishuai Guo, Rong Jin, Jiebo Luo, Tianbao Yang

Abstract

In this paper, we tackle a novel federated learning (FL) problem for optimizing a family of X-risks, to which no existing FL algorithms are applicable. In particular, the objective has the form of $\mathbb E_{z\sim S_1} f(\mathbb E_{z'\sim S_2} \ell(w; z, z'))$, where two sets of data $S_1, S_2$ are distributed over multiple machines, $\ell(\cdot)$ is a pairwise loss that only depends on the prediction outputs of the input data pair $(z, z')$, and $f(\cdot)$ is a possibly non-linear, non-convex function. This problem has important applications in machine learning, e.g., AUROC maximization with a pairwise loss, and partial AUROC maximization with a compositional loss. The challenges in designing an FL algorithm for X-risks lie in the non-decomposability of the objective over multiple machines and the interdependency between different machines. To this end, we propose an active-passive decomposition framework that decouples the components of the gradient into two types, namely active parts and passive parts, where the active parts depend on local data and are computed with the local model, while the passive parts depend on other machines and are communicated/computed based on historical models and samples. Under this framework, we develop two provable FL algorithms (FeDXL) for handling linear and nonlinear $f$, respectively, based on federated averaging and merging. We develop a novel theoretical analysis to combat the latency of the passive parts and the interdependency between the local model parameters and the data involved in computing local gradient estimators. We establish both iteration and communication complexities and show that using historical samples and models to compute the passive parts does not degrade the complexities. We conduct empirical studies of FeDXL for deep AUROC and partial AUROC maximization, and demonstrate their performance compared with several baselines.
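To make the objective form concrete, the following is a minimal sketch (not the paper's implementation) of the empirical X-risk $\mathbb E_{z\sim S_1} f(\mathbb E_{z'\sim S_2} \ell(w; z, z'))$ for the AUROC setting, with $S_1$ the positive examples, $S_2$ the negatives, and a squared-hinge pairwise surrogate; the function names and the choice of surrogate loss are illustrative assumptions.

```python
import numpy as np

def squared_hinge(margin, c=1.0):
    # Pairwise surrogate loss ell(w; z, z') = max(0, c - margin)^2,
    # where margin = h_w(x) - h_w(x') for a (positive, negative) pair.
    return np.maximum(0.0, c - margin) ** 2

def x_risk(pos_scores, neg_scores, f=lambda g: g):
    # Empirical X-risk: mean over z in S_1 of f(mean over z' in S_2 of ell).
    # f is the identity for plain pairwise AUROC maximization and a
    # non-linear function in the compositional case (e.g., partial AUROC).
    inner = np.array([
        np.mean(squared_hinge(p - neg_scores)) for p in pos_scores
    ])
    return float(np.mean(f(inner)))

# Perfectly separated scores with margin >= c give zero risk.
pos = np.array([2.5, 3.0])
neg = np.array([0.0, 0.5])
print(x_risk(pos, neg))  # 0.0
```

Note that the inner average over $S_2$ is what makes the objective non-decomposable across machines when the negatives are distributed: a local machine cannot evaluate $\mathbb E_{z'\sim S_2}\ell(w; z, z')$ from its own data alone, which is the difficulty the active-passive decomposition addresses.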
