Modeling assembly bias with machine learning and symbolic regression

Paper title

Modeling assembly bias with machine learning and symbolic regression

Authors

Digvijay Wadekar, Francisco Villaescusa-Navarro, Shirley Ho, Laurence Perreault-Levasseur

Abstract

Upcoming 21-cm surveys will map the spatial distribution of cosmic neutral hydrogen (HI) over unprecedented volumes. Mock catalogs are needed to fully exploit the potential of these surveys. Standard techniques used to create these mock catalogs, such as the Halo Occupation Distribution (HOD), rely on assumptions such as that the baryonic properties of dark matter halos depend only on their masses. In this work, we use the state-of-the-art magneto-hydrodynamic simulation IllustrisTNG to show that the HI content of halos exhibits a strong dependence on their local environment. We then use machine learning techniques to show that this effect can be 1) modeled by these algorithms and 2) parametrized in the form of novel analytic equations. We provide physical explanations for this environmental effect and show that ignoring it leads to underprediction of the real-space 21-cm power spectrum at $k \gtrsim 0.05\,h/\mathrm{Mpc}$ by $\gtrsim 10\%$, which is larger than the expected precision from upcoming surveys on such large scales. Our methodology of combining numerical simulations with machine learning techniques is general, and opens a new direction in modeling and parametrizing the complex physics of assembly bias needed to generate accurate mocks for galaxy and line intensity mapping surveys.
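The core claim of the abstract, that a halo's HI content depends on its local environment as well as its mass, can be illustrated with a toy calculation. The sketch below is not the authors' pipeline: it replaces their machine learning and symbolic regression on IllustrisTNG with ordinary least squares on synthetic data. Every number in it (mass range, slopes, scatter) and every name (`make_toy_halos`, `fit_rms`) is a hypothetical choice made for illustration only. It compares a mass-only fit, in the spirit of a standard HOD assumption, with a fit that also sees an environment variable.

```python
import numpy as np


def make_toy_halos(n=5000, seed=0):
    """Generate synthetic halos. All coefficients are assumed, not from the paper."""
    rng = np.random.default_rng(seed)
    log_m = rng.uniform(10.0, 14.0, n)   # log10 halo mass (toy range)
    log_env = rng.normal(0.0, 0.5, n)    # log10(1 + local overdensity), toy values
    # Assumed toy relation: HI mass depends on halo mass AND environment.
    log_mhi = (0.8 * (log_m - 12.0) + 0.4 * log_env + 9.5
               + rng.normal(0.0, 0.1, n))
    return log_m, log_env, log_mhi


def fit_rms(features, target):
    """Least-squares fit of target on the given features plus a constant.

    Returns the fitted coefficients and the RMS of the residuals.
    """
    design = np.column_stack([*features, np.ones_like(target)])
    coeffs, *_ = np.linalg.lstsq(design, target, rcond=None)
    residual = target - design @ coeffs
    return coeffs, float(np.sqrt(np.mean(residual ** 2)))


log_m, log_env, log_mhi = make_toy_halos()

# Mass-only model: the environment term ends up as unexplained scatter.
coef_mass, rms_mass_only = fit_rms([log_m], log_mhi)

# Mass + environment model: recovers the assumed environmental slope.
coef_env, rms_with_env = fit_rms([log_m, log_env], log_mhi)

print(f"RMS residual, mass only:          {rms_mass_only:.3f} dex")
print(f"RMS residual, mass + environment: {rms_with_env:.3f} dex")
print(f"Recovered environment slope:      {coef_env[1]:.3f}")
```

In this toy setup the mass-only fit absorbs the environmental dependence into its residual scatter, while the two-feature fit recovers the slope that was put in by hand. The paper's analysis works with simulated halos and more flexible models, so this only conveys the qualitative idea.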
