Deep learning for automatic head and neck lymph node level delineation provides expert-level accuracy

Paper title

Deep learning for automatic head and neck lymph node level delineation provides expert-level accuracy

Authors

Thomas Weissmann, Yixing Huang, Stefan Fischer, Johannes Roesch, Sina Mansoorian, Horacio Ayala Gaona, Antoniu-Oreste Gostian, Markus Hecht, Sebastian Lettmaier, Lisa Deloch, Benjamin Frey, Udo S. Gaipl, Luitpold V. Distel, Andreas Maier, Heinrich Iro, Sabine Semrau, Christoph Bert, Rainer Fietkau, Florian Putz

Abstract

Background: Deep learning (DL)-based head and neck lymph node level (HN_LNL) autodelineation is of high relevance to radiotherapy research and clinical treatment planning but is still underinvestigated in the academic literature. Methods: An expert-delineated cohort of 35 planning CTs was used to train an nnU-net 3D-fullres/2D-ensemble model for autosegmentation of 20 different HN_LNL. A second cohort, acquired later at the same institution, served as the test set (n=20). In a completely blinded evaluation, 3 clinical experts rated the quality of DL autosegmentations in a head-to-head comparison with expert-created contours. For a subgroup of 10 cases, intraobserver variability was compared to the average DL autosegmentation accuracy on the original and recontoured sets of expert segmentations. A postprocessing step to adjust the craniocaudal boundaries of level autosegmentations to the CT slice plane was introduced, and its effect on geometric accuracy and expert rating was investigated. Results: Blinded expert ratings for DL segmentations and expert-created contours were not significantly different. DL segmentations with slice plane adjustment were rated numerically higher (mean, 81.0 vs. 79.6, p=0.185) and DL segmentations without slice plane adjustment were rated numerically lower (77.2 vs. 79.6, p=0.167) than manually drawn contours. DL segmentations with CT slice plane adjustment were rated significantly better than DL contours without slice plane adjustment (81.0 vs. 77.2, p=0.004). Geometric accuracy of DL segmentations was not different from intraobserver variability (mean, 0.76 vs. 0.77, p=0.307). Conclusions: We show that an nnU-net 3D-fullres/2D-ensemble model can be used for highly accurate autodelineation of HN_LNL using only a limited training dataset that is ideally suited for large-scale standardized autodelineation of HN_LNL in the research setting.
