Paper title
Leveraging Global Binary Masks for Structure Segmentation in Medical Images
Authors
Abstract
Deep learning (DL) models for medical image segmentation are strongly affected by intensity variations in the input images and generalize poorly because they rely primarily on pixel intensity information for inference. Acquiring sufficient training data is another challenge that limits the application of these models. We propose to leverage the consistency of organs' anatomical shape and position in medical images. We introduce a framework that exploits recurring anatomical patterns through global binary masks for organ segmentation. Two scenarios were studied. 1) Global binary masks were the only input to the model (a U-Net), forcing it to encode exclusively the organs' position and shape information for segmentation/localization. 2) Global binary masks were incorporated as an additional input channel providing position/shape clues to mitigate training data scarcity. Two datasets of brain and heart CT images with ground-truth segmentations were split 26:10:10 and 12:3:5, respectively, for training, validation, and testing. Training exclusively on global binary masks led to Dice scores of 0.77 (0.06) and 0.85 (0.04), with average Euclidean distances of 3.12 (1.43) mm and 2.5 (0.93) mm relative to the center of mass of the ground truth, for the brain and heart structures respectively. These outcomes indicate that a surprising degree of position and shape information is encoded in global binary masks. With small subsets of the training data, incorporating global binary masks led to significantly higher accuracy than training the model on CT images alone; performance improved by 4.3-125.3% and 1.3-48.1% for 1-8 training cases of the brain and heart datasets respectively. The findings suggest that global binary masks are advantageous both for building generalizable models and for compensating for training data scarcity.
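The abstract reports two evaluation quantities (Dice score and the Euclidean distance between centers of mass) and describes feeding the mask as an additional input channel. The following is a minimal NumPy sketch of how such quantities are commonly computed; it is not the authors' implementation. All function names, the voxel-spacing argument, the channels-first layout, and the convention for two empty masks are assumptions made for illustration, and the sketch does not cover how the global binary masks themselves are constructed, which the abstract does not specify.

```python
import numpy as np


def dice_score(pred, truth):
    """Dice overlap between two binary masks of identical shape."""
    pred = np.asarray(pred, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    total = pred.sum() + truth.sum()
    if total == 0:
        return 1.0  # assumed convention: two empty masks agree perfectly
    return float(2.0 * np.logical_and(pred, truth).sum() / total)


def center_of_mass_mm(mask, spacing_mm):
    """Centroid of a non-empty binary mask, scaled to millimetres."""
    coords = np.argwhere(np.asarray(mask, dtype=bool))  # shape (n_voxels, ndim)
    return coords.mean(axis=0) * np.asarray(spacing_mm, dtype=float)


def center_distance_mm(pred, truth, spacing_mm):
    """Euclidean distance between predicted and ground-truth centroids."""
    delta = center_of_mass_mm(pred, spacing_mm) - center_of_mass_mm(truth, spacing_mm)
    return float(np.linalg.norm(delta))


def stack_with_mask(ct_volume, global_mask):
    """Scenario 2 style input: CT intensities plus the mask as a second channel.

    The channels-first layout is an assumption of this sketch.
    """
    ct = np.asarray(ct_volume, dtype=np.float32)
    mask = np.asarray(global_mask, dtype=np.float32)
    return np.stack([ct, mask], axis=0)


# Toy example: two 4x4x4 cubes offset by one voxel along the first axis.
truth = np.zeros((8, 8, 8), dtype=bool)
truth[2:6, 2:6, 2:6] = True
pred = np.zeros((8, 8, 8), dtype=bool)
pred[3:7, 2:6, 2:6] = True

spacing = (2.0, 1.0, 1.0)  # hypothetical voxel size in mm
print(dice_score(pred, truth))                   # 0.75
print(center_distance_mm(pred, truth, spacing))  # 2.0
```

In the toy example the cubes share 48 of their 64 voxels each, giving a Dice score of 2 × 48 / 128 = 0.75, and the one-voxel shift along an axis with 2 mm spacing gives a centroid distance of 2 mm.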