Paper title
Scientific intuition inspired by machine learning generated hypotheses
Authors
Abstract
Machine learning with application to questions in the physical sciences has become a widely used tool, successfully applied to classification, regression and optimization tasks in many areas. Research focus mostly lies in improving the accuracy of machine learning models in numerical predictions, while scientific understanding is still almost exclusively generated by human researchers analysing numerical results and drawing conclusions. In this work, we shift the focus to the insights and the knowledge obtained by the machine learning models themselves. In particular, we study how this knowledge can be extracted and used to inspire human scientists and to increase their intuition and understanding of natural systems. We apply gradient boosting in decision trees to extract human-interpretable insights from big data sets from chemistry and physics. In chemistry, we not only rediscover widely known rules of thumb but also find new interesting motifs that tell us how to control the solubility and energy levels of organic molecules. At the same time, in quantum physics, we gain new understanding of experiments for quantum entanglement. The ability to go beyond numerics and to enter the realm of scientific insight and hypothesis generation opens the door to using machine learning to accelerate the discovery of conceptual understanding in some of the most challenging domains of science.