Paper Title
Towards Robust Deep Active Learning for Scientific Computing
Paper Authors
Paper Abstract
Deep learning (DL) is revolutionizing the scientific computing community. To reduce the data gap, active learning has been identified as a promising solution for DL in scientific computing. However, the deep active learning (DAL) literature is dominated by image classification problems and pool-based methods. Here we investigate the robustness of pool-based DAL methods for scientific computing problems (dominated by regression), where DNNs are increasingly used. We show that modern pool-based DAL methods all share an untunable hyperparameter, termed the pool ratio and denoted $\gamma$, which is often assumed to be known a priori in the literature. We evaluate the performance of five state-of-the-art DAL methods on six benchmark problems under the assumption that $\gamma$ is \textit{not} known, a more realistic assumption for scientific computing problems. Our results indicate that this reduces the performance of modern DAL methods, which can sometimes perform even worse than random sampling, creating significant uncertainty when they are used in real-world settings. To overcome this limitation we propose, to our knowledge, the first query synthesis DAL method for regression, termed NA-QBC. NA-QBC removes the sensitive $\gamma$ hyperparameter, and we find that, on average, it outperforms the other DAL methods on our benchmark problems. Crucially, NA-QBC always outperforms random sampling, providing more robust performance benefits.
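To make the role of the pool ratio concrete, here is a minimal sketch of one pool-based, query-by-committee step for regression; it is an illustration of the generic setting the abstract describes, not the paper's NA-QBC. It assumes a hypothetical `committee` of independently trained regressors exposing a scikit-learn-style `predict`, and inputs normalized to $[-1, 1]^d$; the function name, arguments, and sampling domain are all illustrative assumptions.

```python
import numpy as np

def pool_based_query(committee, n_query, gamma, dim, rng):
    """One pool-based DAL query step for regression (illustrative sketch).

    The pool ratio gamma fixes the candidate pool size relative to the
    query batch (pool_size = gamma * n_query); this is the hyperparameter
    the abstract argues cannot be tuned a priori.
    """
    # Draw a random candidate pool from the (assumed normalized) input domain.
    pool = rng.uniform(-1.0, 1.0, size=(gamma * n_query, dim))
    # QBC-style acquisition: score each candidate by the variance of the
    # committee's predictions, i.e. how strongly the models disagree on it.
    preds = np.stack([model.predict(pool) for model in committee])
    preds = preds.reshape(len(committee), len(pool), -1)
    disagreement = preds.var(axis=0).mean(axis=-1)
    # Query the n_query candidates with the highest disagreement for labeling.
    return pool[np.argsort(disagreement)[-n_query:]]
```

Because the candidate pool is fixed at $\gamma$ times the query batch size, the quality of the selected points depends directly on the choice of $\gamma$. A query synthesis method such as the proposed NA-QBC instead generates candidate inputs directly rather than ranking a pre-drawn random pool, which is how it removes $\gamma$ altogether.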