Paper title

PediCXR: An open, large-scale chest radiograph dataset for interpretation of common thoracic diseases in children

Authors

Pham, Hieu H., Nguyen, Ngoc H., Tran, Thanh T., Nguyen, Tuan N. M., Nguyen, Ha Q.

Abstract

The development of diagnostic models for detecting and diagnosing pediatric diseases in chest X-ray (CXR) scans has been hindered by the lack of high-quality, physician-annotated datasets. To overcome this challenge, we introduce and release PediCXR, a new pediatric CXR dataset of 9,125 studies retrospectively collected from a major pediatric hospital in Vietnam between 2020 and 2021. Each scan was manually annotated by a pediatric radiologist with more than ten years of experience. The dataset was labeled for the presence of 36 critical findings and 15 diseases. In particular, each abnormal finding was localized with a rectangular bounding box on the image. To the best of our knowledge, this is the first and largest pediatric CXR dataset containing both lesion-level annotations and image-level labels for the detection of multiple findings and diseases. For algorithm development, the dataset was divided into a training set of 7,728 studies and a test set of 1,397 studies. To encourage new advances in pediatric CXR interpretation using data-driven approaches, we provide a detailed description of the PediCXR data sample and make the dataset publicly available at https://physionet.org/content/pedicxr/1.0.0/
