Title
Predicting the Thermal Sunyaev-Zel'dovich Field using Modular and Equivariant Set-Based Neural Networks
Authors
Abstract
Theoretical uncertainty limits our ability to extract cosmological information from baryonic fields such as the thermal Sunyaev-Zel'dovich (tSZ) effect. Being sourced by the electron pressure field, the tSZ effect depends on baryonic physics that is usually modeled by expensive hydrodynamic simulations. We train neural networks on the IllustrisTNG-300 cosmological simulation to predict the continuous electron pressure field in galaxy clusters from gravity-only simulations. Modeling clusters is challenging for neural networks because most of the gas pressure is concentrated in a handful of voxels, and even the largest hydrodynamical simulations contain only a few hundred clusters that can be used for training. Instead of conventional convolutional neural network (CNN) architectures, we employ a rotationally equivariant DeepSets architecture that operates directly on the set of dark matter particles. We argue that set-based architectures provide distinct advantages over CNNs. For example, we can enforce exact rotational and permutation equivariance, incorporate existing knowledge on the tSZ field, and work with sparse fields as are standard in cosmology. We compose our architecture from separate, physically meaningful modules, making it amenable to interpretation. For example, we can separately study the influence of local and cluster-scale environment, determine that cluster triaxiality has negligible impact, and train a module that corrects for mis-centering. Our model improves by 70% on analytic profiles fit to the same simulation data. We argue that the electron pressure field, viewed as a function of a gravity-only simulation, has inherent stochasticity, and we model this property through a conditional-VAE extension to the network. This modification yields a further improvement of 7%, although it is limited by our small training set. (abridged)
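To make the symmetry argument concrete, the following is a minimal sketch of a DeepSets-style map on a particle set. It is not the architecture of the paper. The weights, the hidden width, the choice of input features (distances to the cluster center and to the query point), and all function names are assumptions made for illustration. Because the inputs are distances and the pooling is a mean over particles, the predicted scalar is unchanged when the particles are reordered or when particles and query point are rotated together.

```python
import math
import random

random.seed(0)

# Hypothetical toy weights for illustration only (not trained modules).
HIDDEN = 8
W_PHI = [[random.gauss(0, 1) for _ in range(HIDDEN)] for _ in range(2)]
W_RHO = [random.gauss(0, 1) for _ in range(HIDDEN)]


def invariant_features(particle, query):
    """Rotation-invariant inputs: distance of the particle to the cluster
    center (assumed to be the origin) and to the query point."""
    return [math.hypot(*particle), math.dist(particle, query)]


def predict_scalar(particles, query):
    """DeepSets-style map: shared phi per particle, mean pooling, linear rho."""
    pooled = [0.0] * HIDDEN
    for p in particles:
        f = invariant_features(p, query)
        for j in range(HIDDEN):
            pooled[j] += math.tanh(f[0] * W_PHI[0][j] + f[1] * W_PHI[1][j])
    pooled = [v / len(particles) for v in pooled]
    return sum(v * w for v, w in zip(pooled, W_RHO))


def rotate_z(point, angle):
    """Rotate a 3D point about the z-axis by the given angle in radians."""
    x, y, z = point
    c, s = math.cos(angle), math.sin(angle)
    return (c * x - s * y, s * x + c * y, z)


# Toy stand-in for dark matter particle positions relative to the cluster center.
particles = [tuple(random.gauss(0, 1) for _ in range(3)) for _ in range(50)]
query = (0.3, -0.2, 0.5)

base = predict_scalar(particles, query)
shuffled = predict_scalar(random.sample(particles, len(particles)), query)
rotated = predict_scalar(
    [rotate_z(p, 0.7) for p in particles], rotate_z(query, 0.7)
)
```

Here `base`, `shuffled`, and `rotated` agree up to floating-point rounding. A voxel-based CNN would only approximate the rotational symmetry, for example through data augmentation.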