Pathological Visual Question Answering

Paper Title

Pathological Visual Question Answering

Authors

Xuehai He, Zhuo Cai, Wenlan Wei, Yichen Zhang, Luntian Mou, Eric Xing, Pengtao Xie

Abstract

Is it possible to develop an "AI Pathologist" that can pass the board certification examination of the American Board of Pathology (ABP)? Building such a system requires addressing three challenges. First, we need a visual question answering (VQA) dataset in which the AI agent is presented with a pathology image together with a question and is asked to give the correct answer. Due to privacy concerns, pathology images are usually not publicly available. Moreover, only well-trained pathologists can understand pathology images, and they rarely have time to help create datasets for AI research. Second, because it is difficult to hire highly experienced pathologists to write pathology visual questions and answers, the resulting pathology VQA dataset may contain errors. Training pathology VQA models on such noisy or even erroneous data yields problematic models that do not generalize well to unseen images. Third, the medical concepts and knowledge covered in pathology question-answer (QA) pairs are very diverse, while the number of QA pairs available for model training is limited. Learning effective representations of diverse medical concepts from limited data is technically demanding. In this paper, we aim to address these three challenges. To the best of our knowledge, our work is the first to address the pathology VQA problem. To deal with the lack of a publicly available pathology VQA dataset, we create the PathVQA dataset. To address the second challenge, we propose a learning-by-ignoring approach. To address the third challenge, we propose to use cross-modal self-supervised learning. We perform experiments on our PathVQA dataset, and the results demonstrate the effectiveness of the proposed learning-by-ignoring and cross-modal self-supervised learning methods.
