Paper Title
CheXstray: Real-time Multi-Modal Data Concordance for Drift Detection in Medical Imaging AI
Paper Authors
Paper Abstract
Clinical Artificial Intelligence (AI) applications are rapidly expanding worldwide and have the potential to impact all areas of medical practice. Medical imaging applications constitute a vast majority of approved clinical AI applications. Though healthcare systems are eager to adopt AI solutions, a fundamental question remains: what happens after the AI model goes into production? We use the CheXpert and PadChest public datasets to build and test a medical imaging AI drift monitoring workflow that tracks data and model drift without contemporaneous ground truth. We simulate drift in multiple experiments to compare model performance with our novel multi-modal drift metric, which takes as input DICOM metadata, image appearance representations from a variational autoencoder (VAE), and model output probabilities. Through experimentation, we demonstrate that unsupervised distributional shifts in relevant metadata, predicted probabilities, and VAE latent representations serve as a strong proxy for ground truth performance. Our key contributions include (1) a proof of concept for medical imaging drift detection that includes the use of a VAE and domain-specific statistical methods, (2) a multi-modal methodology to measure and unify drift metrics, (3) new insights into the challenges of, and solutions for, observing deployed medical imaging AI, and (4) open-source tools that enable others to easily run their own workflows and scenarios. This work has important implications: it addresses the concerning translation gap in continuous monitoring of medical imaging AI models, a gap that is common in dynamic healthcare environments.
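As a rough illustration of what measuring and unifying per-modality drift could look like, the sketch below compares a reference window against a current window. The abstract does not say which statistics or weighting scheme the authors use, so everything here is an assumption made for illustration: a two-sample Kolmogorov-Smirnov statistic for continuous values (such as predicted probabilities or a VAE latent dimension), total variation distance for categorical DICOM metadata, and a weighted average to combine them. The function names `ks_statistic`, `total_variation`, and `unified_drift` are hypothetical and are not taken from the paper's tooling.

```python
from bisect import bisect_right
from collections import Counter


def ks_statistic(reference, current):
    """Two-sample Kolmogorov-Smirnov statistic: largest gap between empirical CDFs."""
    ref, cur = sorted(reference), sorted(current)
    gap = 0.0
    # The empirical CDFs only change at observed values, so checking those suffices.
    for x in set(ref) | set(cur):
        cdf_ref = bisect_right(ref, x) / len(ref)
        cdf_cur = bisect_right(cur, x) / len(cur)
        gap = max(gap, abs(cdf_ref - cdf_cur))
    return gap


def total_variation(reference, current):
    """Total variation distance between two categorical samples, in [0, 1]."""
    ref, cur = Counter(reference), Counter(current)
    categories = set(ref) | set(cur)
    return 0.5 * sum(
        abs(ref[c] / len(reference) - cur[c] / len(current)) for c in categories
    )


def unified_drift(per_feature, weights):
    """Weighted average of per-feature drift values (assumed combination rule)."""
    total_weight = sum(weights[name] for name in per_feature)
    return sum(weights[name] * value for name, value in per_feature.items()) / total_weight


# Toy windows; real inputs would come from a deployed model's logs (assumption).
reference = {"prob": [0.10, 0.20, 0.30, 0.40], "view": ["AP", "PA", "PA", "PA"]}
current = {"prob": [0.60, 0.70, 0.80, 0.90], "view": ["AP", "AP", "AP", "PA"]}

drift = {
    "prob": ks_statistic(reference["prob"], current["prob"]),
    "view": total_variation(reference["view"], current["view"]),
}
score = unified_drift(drift, weights={"prob": 1.0, "view": 1.0})
```

Both statistics lie in [0, 1], which makes a plain weighted average a convenient way to combine them; a production workflow would likely also need normalization against a baseline period and a choice of alerting threshold, neither of which is specified in the abstract.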