Paper title
Study of the performance and scalability of federated learning for medical imaging with intermittent clients
Paper authors
Abstract
Federated learning is a privacy-preserving technique based on data decentralization, used to carry out machine or deep learning in a secure way. In this paper we present theoretical aspects of federated learning, including an aggregation operator, the different types of federated learning, and issues to be taken into account regarding the distribution of data across clients, together with an exhaustive analysis of a use case in which the number of clients varies. Specifically, we propose a medical image analysis use case based on chest X-ray images obtained from an open data repository. In addition to the privacy-related advantages, we study improvements in predictions (in terms of accuracy, loss and area under the curve) and reductions in execution time with respect to the classical, centralized approach. Different clients are simulated from the training data, selected in an unbalanced manner. The results obtained with three and with ten clients are presented and compared with each other and with the centralized case. Two different problems related to intermittent clients are discussed, together with two approaches for handling each of them. Problems of this type may arise in a real scenario because some clients leave the training while others join it, or because clients experience technical or connectivity problems. Finally, improvements and future work in the field are proposed.
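The abstract does not say which aggregation operator the paper uses or how it handles intermittent clients. As an illustration only, the sketch below assumes a sample-weighted average of client weights (FedAvg-style) and one simple way to treat a client that misses a round: it is skipped and the remaining weights are renormalized. The function name `aggregate`, the client identifiers and the sample counts are all hypothetical.

```python
from typing import Dict, List, Optional

Weights = List[float]


def aggregate(
    updates: Dict[str, Optional[Weights]],
    n_samples: Dict[str, int],
) -> Weights:
    """Sample-weighted average of client weights (assumed FedAvg-style operator).

    A client whose update is None is treated as absent for this round and is
    left out of both the numerator and the normalizing total.
    """
    active = {cid: w for cid, w in updates.items() if w is not None}
    if not active:
        raise ValueError("no client reported in this round")
    total = sum(n_samples[cid] for cid in active)
    dim = len(next(iter(active.values())))
    return [
        sum(n_samples[cid] * w[i] for cid, w in active.items()) / total
        for i in range(dim)
    ]


# Three hypothetical clients with unbalanced data; client "c" drops out.
round_updates = {"a": [1.0, 2.0], "b": [3.0, 4.0], "c": None}
sizes = {"a": 100, "b": 300, "c": 600}
global_weights = aggregate(round_updates, sizes)
print(global_weights)  # [2.5, 3.5]
```

With client "c" absent, only the 400 samples held by "a" and "b" enter the normalization, so the larger client "b" dominates the average. Other policies, such as reusing a client's last known weights, are equally plausible and are not covered by this sketch.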