Paper Title

Data-Augmentation for Graph Neural Network Learning of the Relaxed Energies of Unrelaxed Structures

Paper Authors

Gibson, Jason B., Hire, Ajinkya C., Hennig, Richard G.

Paper Abstract

Computational materials discovery has continually grown in utility over the past decade due to advances in computing power and crystal structure prediction algorithms (CSPA). However, the computational cost of the \textit{ab initio} calculations required by CSPA limits its utility to small unit cells, reducing the compositional and structural space the algorithms can explore. Past studies have bypassed many unneeded \textit{ab initio} calculations by utilizing machine learning methods to predict formation energy and determine the stability of a material. Specifically, graph neural networks display high fidelity in predicting formation energy. Traditionally, graph neural networks are trained on large data sets of relaxed structures. Unfortunately, the geometries of unrelaxed candidate structures produced by CSPA often deviate from the relaxed state, which leads to poor predictions, hindering the model's ability to filter out energetically unfavorable structures prior to \textit{ab initio} evaluation. This work shows that the prediction error on relaxed structures decreases as training progresses, while the prediction error on unrelaxed structures increases, suggesting an inverse correlation between relaxed and unrelaxed structure prediction accuracy. To remedy this behavior, we propose a simple, physically motivated, computationally cheap perturbation technique that augments the training data to dramatically improve predictions on unrelaxed structures. On our test set consisting of 623 Nb-Sr-H hydride structures, we found that training a crystal graph convolutional neural network with our augmentation method reduced the MAE of formation energy prediction by 66\% compared to training with only relaxed structures. We then show how this error reduction can accelerate CSPA by improving the model's ability to accurately filter out energetically unfavorable structures.
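
The abstract does not give the implementation details of the perturbation scheme. As a rough illustration only, the Python sketch below shows what this kind of augmentation could look like; the use of pymatgen's Structure.perturb, the 0.05-0.5 Å displacement range, the number of copies, and the toy Nb cell are all illustrative assumptions, not the paper's actual parameters.

    import random

    from pymatgen.core import Lattice, Structure


    def augment_with_perturbations(structure, n_copies=5, max_disp=0.5):
        """Return randomly perturbed copies of a relaxed structure.

        Each copy would keep the relaxed structure's energy as its label,
        mimicking unrelaxed CSPA candidates near the relaxed geometry.
        (Illustrative sketch; not the authors' exact procedure.)
        """
        augmented = []
        for _ in range(n_copies):
            perturbed = structure.copy()
            # Displace every site by a random vector of the chosen magnitude.
            perturbed.perturb(distance=random.uniform(0.05, max_disp))
            augmented.append(perturbed)
        return augmented


    # Example: a toy bcc Nb cell standing in for a relaxed Nb-Sr-H structure.
    relaxed = Structure(Lattice.cubic(3.30), ["Nb", "Nb"],
                        [[0.0, 0.0, 0.0], [0.5, 0.5, 0.5]])
    pool = [relaxed] + augment_with_perturbations(relaxed)
    print(f"{len(pool)} training structures generated from one relaxed cell")

The key idea this sketch tries to capture, consistent with the paper's title, is that perturbed geometries are paired with the relaxed structure's energy so the network learns to map near-relaxed (unrelaxed-like) inputs to relaxed formation energies.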
