Paper title
Improving the quality control of seismic data through active learning
Paper authors
Abstract
In image denoising problems, the increasing density of available images makes exhaustive visual inspection impossible, so automated methods based on machine learning must be deployed for this purpose. This is particularly the case in seismic signal processing, where engineers and geophysicists have to deal with millions of seismic time series. Finding the sub-surface properties useful for the oil industry may take up to a year and is very costly in terms of computing and human resources. In particular, the data must go through several noise attenuation steps. Each denoising step is ideally followed by a quality control (QC) stage that relies on human expertise. To learn a quality control classifier in a supervised manner, labeled training data must be available, but collecting labels from human experts is extremely time-consuming. We therefore propose a novel active learning methodology that sequentially selects the most relevant data, which are then given to a human expert for labeling. Beyond its application in geophysics, the technique we promote in this paper, based on estimates of the local error and its uncertainty, is generic. Its performance is supported by strong empirical evidence, as illustrated by the numerical experiments presented in this article, where it is compared to alternative active learning strategies on both synthetic and real seismic datasets.