Paper Title
Rethinking Generalization: The Impact of Annotation Style on Medical Image Segmentation
Paper Authors
Paper Abstract
Generalization is an important attribute of machine learning models, particularly for those that are to be deployed in a medical context, where unreliable predictions can have real-world consequences. While the failure of models to generalize across datasets is typically attributed to a mismatch in the data distributions, performance gaps are often a consequence of biases in the 'ground-truth' label annotations. This is particularly important in the context of medical image segmentation of pathological structures (e.g., lesions), where the annotation process is much more subjective and affected by a number of underlying factors, including the annotation protocol, rater education/experience, and clinical aims, among others. In this paper, we show that modeling annotation biases, rather than ignoring them, poses a promising way of accounting for differences in annotation style across datasets. To this end, we propose a generalized conditioning framework to (1) learn and account for different annotation styles across multiple datasets using a single model, (2) identify similar annotation styles across different datasets in order to permit their effective aggregation, and (3) fine-tune a fully trained model to a new annotation style with just a few samples. Next, we present an image-conditioning approach to model annotation styles that correlate with specific image features, potentially enabling detection biases to be more easily identified.
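To make the conditioning idea concrete, below is a minimal sketch (not the paper's published code) of one way such a framework could be realized: a segmentation network whose features are modulated by a learned embedding of a per-dataset "annotation style" ID, in the spirit of FiLM / conditional normalization. All class names, parameters, and the architecture itself (`StyleFiLM`, `ConditionedSegNet`, a single modulated encoder layer) are illustrative assumptions rather than the authors' implementation.

```python
# Hedged sketch: dataset-ID ("annotation style") conditioning for segmentation.
# Assumes a FiLM-style scale/shift of feature maps; the paper's actual
# architecture and conditioning mechanism may differ.
import torch
import torch.nn as nn


class StyleFiLM(nn.Module):
    """Scale and shift feature maps using an embedding of the style (dataset) ID."""

    def __init__(self, num_styles: int, num_channels: int):
        super().__init__()
        self.scale = nn.Embedding(num_styles, num_channels)
        self.shift = nn.Embedding(num_styles, num_channels)
        nn.init.ones_(self.scale.weight)   # start as identity modulation
        nn.init.zeros_(self.shift.weight)

    def forward(self, x: torch.Tensor, style_id: torch.Tensor) -> torch.Tensor:
        gamma = self.scale(style_id)[:, :, None, None]  # (B, C, 1, 1)
        beta = self.shift(style_id)[:, :, None, None]
        return gamma * x + beta


class ConditionedSegNet(nn.Module):
    """Tiny encoder-decoder whose intermediate features depend on the style ID."""

    def __init__(self, num_styles: int, in_ch: int = 1, base_ch: int = 16):
        super().__init__()
        self.enc = nn.Sequential(nn.Conv2d(in_ch, base_ch, 3, padding=1), nn.ReLU())
        self.film = StyleFiLM(num_styles, base_ch)
        self.dec = nn.Conv2d(base_ch, 1, 1)  # per-pixel lesion logit

    def forward(self, image: torch.Tensor, style_id: torch.Tensor) -> torch.Tensor:
        feats = self.film(self.enc(image), style_id)
        return self.dec(feats)


# Usage: train jointly on several datasets, passing each sample's dataset ID so
# the shared backbone learns anatomy/pathology while the embeddings absorb
# annotation-style differences. A new annotation style could then be added by
# fine-tuning a fresh embedding row on a few samples, keeping shared weights frozen.
model = ConditionedSegNet(num_styles=3)
images = torch.randn(4, 1, 64, 64)
style_ids = torch.tensor([0, 1, 2, 0])
logits = model(images, style_ids)
print(logits.shape)  # torch.Size([4, 1, 64, 64])
```

In this sketch, comparing the learned embeddings (e.g., their pairwise distances) is one plausible way to identify datasets with similar annotation styles before aggregating them; the image-conditioning variant described in the abstract would instead derive the modulation from image features rather than a dataset ID.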