Paper Title
Dataset Distillation for Medical Dataset Sharing
Paper Authors
Paper Abstract
Sharing medical datasets between hospitals is challenging because of privacy-protection concerns and the massive cost of transmitting and storing many high-resolution medical images. However, dataset distillation can synthesize a small dataset such that models trained on it achieve performance comparable to that of models trained on the original large dataset, which shows potential for solving these medical data-sharing problems. Hence, this paper proposes a novel dataset-distillation-based method for medical dataset sharing. Experimental results on a COVID-19 chest X-ray image dataset show that our method can achieve high detection performance even when using only a small number of anonymized chest X-ray images.
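The core idea of dataset distillation, a small synthetic set on which training yields a model comparable to one trained on the full data, can be illustrated with a toy sketch. The example below is not the paper's method and does not involve images: it assumes a linear ridge-regression model, random synthetic inputs, and distills only the synthetic labels in closed form. All names (`ridge_fit`, `distill_labels`, `X_syn`, `y_syn`) and all sizes are hypothetical choices made for illustration.

```python
import numpy as np

def ridge_fit(X, y, lam=1e-6):
    """Closed-form ridge regression; this is the 'training' step of the toy."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

def distill_labels(X_real, y_real, X_syn, lam=1e-6):
    """Pick synthetic labels so that a model trained on (X_syn, y_syn)
    fits the real data as well as possible (label-only distillation)."""
    d = X_syn.shape[1]
    # Ridge weights are linear in the labels: w = A @ y_syn.
    A = np.linalg.solve(X_syn.T @ X_syn + lam * np.eye(d), X_syn.T)
    # Outer problem: minimise ||X_real @ A @ y_syn - y_real||^2 over y_syn.
    y_syn, *_ = np.linalg.lstsq(X_real @ A, y_real, rcond=None)
    return y_syn

def mse(X, y, w):
    """Mean squared error of the linear model w on (X, y)."""
    return float(np.mean((X @ w - y) ** 2))

# Assumed toy data: 1000 "real" samples, 10 synthetic samples, 5 features.
rng = np.random.default_rng(0)
n, m, d = 1000, 10, 5
w_true = rng.normal(size=d)
X_real = rng.normal(size=(n, d))
y_real = X_real @ w_true + 0.1 * rng.normal(size=n)
X_syn = rng.normal(size=(m, d))  # synthetic inputs, unrelated to real samples

y_syn = distill_labels(X_real, y_real, X_syn)

w_full = ridge_fit(X_real, y_real)   # trained on all 1000 real samples
w_small = ridge_fit(X_syn, y_syn)    # trained on the 10 distilled samples

mse_full = mse(X_real, y_real, w_full)
mse_small = mse(X_real, y_real, w_small)
print(f"MSE (full data): {mse_full:.5f}")
print(f"MSE (distilled): {mse_small:.5f}")
```

In this linear setting the inner training step has a closed form, so the outer optimization reduces to a least-squares problem. For deep networks on images, the synthetic samples themselves are typically optimized by gradient-based methods, which this sketch does not attempt.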