COVID-CT-Dataset: A CT Scan Dataset about COVID-19

Paper Title

COVID-CT-Dataset: A CT Scan Dataset about COVID-19

Authors

Xingyi Yang, Xuehai He, Jinyu Zhao, Yichen Zhang, Shanghang Zhang, Pengtao Xie

Abstract

During the COVID-19 outbreak, computed tomography (CT) has been a useful means of diagnosing COVID-19 patients. Due to privacy issues, publicly available COVID-19 CT datasets are highly difficult to obtain, which hinders the research and development of AI-powered diagnosis methods of COVID-19 based on CTs. To address this issue, we build an open-sourced dataset, COVID-CT, which contains 349 COVID-19 CT images from 216 patients and 463 non-COVID-19 CTs. The utility of this dataset is confirmed by a senior radiologist who has been diagnosing and treating COVID-19 patients since the outbreak of this pandemic. We also perform experimental studies which further demonstrate that this dataset is useful for developing AI-based diagnosis models of COVID-19. Using this dataset, we develop diagnosis methods based on multi-task learning and self-supervised learning, which achieve an F1 of 0.90, an AUC of 0.98, and an accuracy of 0.89. According to the senior radiologist, models with such performance are good enough for clinical usage. The data and code are available at https://github.com/UCSD-AI4H/COVID-CT
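
The abstract reports three evaluation metrics: F1, AUC, and accuracy. As a reading aid, the sketch below shows how these are conventionally computed for a binary COVID / non-COVID classifier. It is not the authors' evaluation code. The function names, the 0.5 decision threshold, and the toy labels and scores are all assumptions made for illustration, and the numbers are not taken from the COVID-CT dataset.

```python
def accuracy_and_f1(labels, preds):
    """Accuracy and F1 for binary labels (1 = COVID-19, 0 = non-COVID-19)."""
    tp = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 1)
    fp = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 0)
    tn = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 0)

    accuracy = (tp + tn) / len(labels)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return accuracy, f1


def auc(labels, scores):
    """ROC AUC as the probability that a random positive outscores a random
    negative (ties count as one half). Quadratic time, fine for a small demo."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Toy data (hypothetical, not from the dataset).
    labels = [1, 1, 1, 0, 0, 0]
    scores = [0.9, 0.8, 0.4, 0.6, 0.2, 0.1]
    preds = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

    acc, f1 = accuracy_and_f1(labels, preds)
    print(f"accuracy={acc:.3f} f1={f1:.3f} auc={auc(labels, scores):.3f}")
```

Accuracy and F1 depend on the chosen decision threshold, while AUC is computed from the raw scores and is threshold-independent, which is one reason the three numbers in the abstract can differ.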
