Paper title
Quasar photometric redshifts from incomplete data using Deep Learning
Paper authors
Paper abstract
Forthcoming astronomical surveys are expected to detect new sources in such large numbers that measuring their spectroscopic redshifts will not be practical. Thus, there is much interest in using machine learning to yield the redshift from the photometry of each object. We are particularly interested in radio sources (quasars) detected with the Square Kilometre Array and have found Deep Learning, trained upon a large optically-selected sample of quasi-stellar objects, to be effective in predicting the redshifts in three external samples of radio-selected sources. However, the requirement of nine different magnitudes, from the near-infrared, optical and ultraviolet bands, significantly reduces the number of sources for which redshifts can be predicted. Here we explore the possibility of using machine learning to impute the missing features. We find that for the training sample, simple imputation is sufficient, particularly replacing the missing magnitude with the maximum for that band, thus presuming that the non-detection is at the sensitivity limit. For the test samples, however, this does not perform as well as multivariate imputation, which suggests that many of the missing magnitudes are not limits but have simply not been observed. From extensive testing of the models, we suggest that the imputation is best restricted to two missing values per source. In the worst case, where the sources overlap on the sky, this increases the fraction of sources for which redshifts can be estimated from 46% to 80%, with >90% being reached for the other samples.
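The simple imputation described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code: the function names `band_maxima` and `impute_with_band_max`, the use of `None` to mark a missing magnitude, and the three-band toy values are all assumptions made for illustration (the paper uses nine magnitudes). The sketch fills each missing magnitude with the largest (faintest) value observed in that band and drops any source with more than two missing values, following the limit suggested in the abstract.

```python
from typing import List, Optional

# One source = one list of magnitudes, one entry per band; None marks a
# missing magnitude (an assumed convention for this sketch).
Row = List[Optional[float]]


def band_maxima(rows: List[Row]) -> List[float]:
    """Return the largest (faintest) observed magnitude in each band."""
    n_bands = len(rows[0])
    maxima = []
    for b in range(n_bands):
        observed = [r[b] for r in rows if r[b] is not None]
        if not observed:
            raise ValueError(f"band {b} has no observed magnitudes")
        maxima.append(max(observed))
    return maxima


def impute_with_band_max(rows: List[Row], max_missing: int = 2) -> List[List[float]]:
    """Fill missing magnitudes with the band maximum.

    Sources with more than `max_missing` missing values are dropped,
    rather than imputed.
    """
    maxima = band_maxima(rows)
    kept = []
    for r in rows:
        n_missing = sum(m is None for m in r)
        if n_missing > max_missing:
            continue  # too incomplete to impute reliably
        kept.append([maxima[b] if m is None else m for b, m in enumerate(r)])
    return kept


# Hypothetical toy sample with three bands.
sample = [
    [18.2, 19.0, 17.5],   # complete
    [None, 20.1, 18.3],   # one missing value
    [21.4, None, None],   # two missing values
    [None, None, None],   # three missing values: dropped
]
filled = impute_with_band_max(sample)
```

The multivariate imputation that the abstract finds better for the test samples is not shown here. One possible route, again an assumption and not something the source specifies, would be a library routine such as scikit-learn's `IterativeImputer`, which predicts each missing feature from the other features.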