Paper title
Peeking inside the Black Box: Interpreting Deep Learning Models for Exoplanet Atmospheric Retrievals
Paper authors
Abstract
Deep learning algorithms are growing in popularity in the field of exoplanetary science due to their ability to model highly non-linear relations and solve interesting problems in a data-driven manner. Several works have attempted to perform fast retrievals of atmospheric parameters using machine learning algorithms such as deep neural networks (DNNs). Yet, despite their high predictive power, DNNs are also infamous for being 'black boxes'. It is their apparent lack of explainability that makes the astrophysics community reluctant to adopt them. What are their predictions based on? How confident should we be in them? When are they wrong, and how wrong can they be? In this work, we present a number of general evaluation methodologies that can be applied to any trained model and answer questions like these. In particular, we train three different popular DNN architectures to retrieve atmospheric parameters from exoplanet spectra and show that all three achieve good predictive performance. We then present an extensive analysis of the predictions of DNNs, which can inform us, among other things, of the credibility limits for atmospheric parameters for a given instrument and model. Finally, we perform a perturbation-based sensitivity analysis to identify the features of the spectrum to which the outcome of the retrieval is most sensitive. We conclude that, for different molecules, the wavelength ranges to which the DNN's predictions are most sensitive indeed coincide with their characteristic absorption regions. The methodologies presented in this work help to improve the evaluation of DNNs and to grant interpretability to their predictions.
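The perturbation-based sensitivity analysis mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `perturbation_sensitivity`, the window size, the Gaussian noise scale, and the stand-in `toy_model` are all assumptions made for illustration. The idea is to add noise to one wavelength window at a time and record how much each predicted parameter changes on average.

```python
import numpy as np


def perturbation_sensitivity(predict, spectra, window=5, scale=0.01,
                             n_repeats=20, seed=0):
    """Mean absolute change in each prediction when one wavelength window is perturbed.

    predict : callable mapping (n_samples, n_bins) -> (n_samples, n_params)
    spectra : array of shape (n_samples, n_bins)
    Returns an array of shape (n_windows, n_params).
    """
    rng = np.random.default_rng(seed)
    baseline = predict(spectra)
    n_bins = spectra.shape[1]
    starts = range(0, n_bins, window)
    sensitivity = np.zeros((len(starts), baseline.shape[1]))
    for i, start in enumerate(starts):
        stop = min(start + window, n_bins)
        for _ in range(n_repeats):
            perturbed = spectra.copy()
            # Assumption: additive Gaussian noise confined to the current window.
            noise = rng.normal(0.0, scale, size=(spectra.shape[0], stop - start))
            perturbed[:, start:stop] += noise
            sensitivity[i] += np.abs(predict(perturbed) - baseline).mean(axis=0)
    return sensitivity / n_repeats


def toy_model(spectra):
    # Hypothetical stand-in for a trained DNN: parameter 0 depends only on
    # bins 10-14 and parameter 1 only on bins 30-34.
    return np.stack([spectra[:, 10:15].sum(axis=1),
                     spectra[:, 30:35].sum(axis=1)], axis=1)


rng = np.random.default_rng(1)
spectra = rng.normal(1.0, 0.1, size=(50, 40))   # synthetic spectra, 40 bins
sens = perturbation_sensitivity(toy_model, spectra)
most_sensitive = sens.argmax(axis=0)            # most sensitive window per parameter
```

With the toy model, the most sensitive windows are index 2 (bins 10-14) for the first parameter and index 6 (bins 30-34) for the second, matching the bins the model actually uses. For a real retrieval network, `predict` would wrap the trained model's forward pass, and the resulting sensitivity map could be compared against known molecular absorption bands.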