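Complex-valued K-means clustering of interpolative separable density fitting algorithm for large-scale hybrid functional enabled ab initio molecular dynamics simulations within plane waves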

Paper title

Complex-valued K-means clustering of interpolative separable density fitting algorithm for large-scale hybrid functional enabled ab initio molecular dynamics simulations within plane waves

Authors

Shizhe Jiao, Jielan Li, Xinming Qin, Lingyun Wan, Wei Hu, Jinlong Yang

Abstract

K-means clustering, a classic unsupervised machine learning algorithm, is the key step for selecting the interpolation sampling points in the interpolative separable density fitting (ISDF) decomposition. Real-valued K-means clustering for accelerating the ISDF decomposition has been demonstrated for large-scale hybrid functional enabled ab initio molecular dynamics (hybrid AIMD) simulations within plane-wave basis sets where the Kohn-Sham orbitals are real-valued. However, it is unclear whether such K-means clustering works for complex-valued Kohn-Sham orbitals. Here, we apply K-means clustering to hybrid AIMD simulations with complex-valued Kohn-Sham orbitals and use an improved weight function, defined as the sum of the square modulus of the complex-valued Kohn-Sham orbitals. Numerical results demonstrate that this improved weight function yields smoother and more delocalized interpolation sampling points, resulting in a smoother energy potential, smaller energy drift and longer time steps for hybrid AIMD simulations compared to the previous weight function used in the real-valued K-means algorithm. In particular, we find that the improved algorithm obtains more accurate oxygen-oxygen radial distribution functions in liquid water molecules and a more accurate power spectrum in crystal silicon dioxide compared to the previous K-means algorithm. Finally, we describe a massively parallel implementation of this ISDF decomposition to accelerate large-scale complex-valued hybrid AIMD simulations containing thousands of atoms (2,744 atoms), which can scale up to 5,504 CPU cores on modern supercomputers.
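
The point-selection step described in the abstract can be illustrated with a small sketch. It is not the authors' implementation: it assumes the weight at each grid point is simply the sum of |phi_i(r)|^2 over the supplied orbitals (the only definition the abstract gives), ignores periodic boundary conditions, and runs serially on a toy grid. The function names `weight_from_orbitals` and `weighted_kmeans` are hypothetical, and the random orbitals are placeholder data.

```python
import numpy as np

def weight_from_orbitals(orbitals):
    """Weight per grid point: sum over orbitals of |phi_i(r)|^2 (assumed form).

    orbitals: complex array of shape (n_orbitals, n_points).
    """
    return np.sum(np.abs(orbitals) ** 2, axis=0)

def weighted_kmeans(points, weights, n_clusters, n_iter=50, seed=0):
    """Weighted Lloyd iterations; returns indices of the grid points
    closest to the converged centroids (candidate interpolation points)."""
    rng = np.random.default_rng(seed)
    # Start from grid points drawn with probability proportional to weight.
    prob = weights / weights.sum()
    start = rng.choice(len(points), size=n_clusters, replace=False, p=prob)
    centroids = points[start].copy()
    for _ in range(n_iter):
        # Assign every grid point to its nearest centroid.
        d2 = ((points[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        # Move each centroid to the weighted mean of its cluster.
        new_centroids = centroids.copy()
        for k in range(n_clusters):
            mask = labels == k
            wk = weights[mask]
            if wk.sum() > 0:
                new_centroids[k] = (wk[:, None] * points[mask]).sum(axis=0) / wk.sum()
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    # Snap centroids to actual grid points; duplicates are merged.
    d2 = ((points[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return np.unique(d2.argmin(axis=0))

# Toy example: a 6x6x6 real-space grid and four random complex "orbitals".
rng = np.random.default_rng(1)
axes = [np.arange(6)] * 3
grid = np.stack(np.meshgrid(*axes, indexing="ij"), axis=-1).reshape(-1, 3).astype(float)
orbitals = rng.normal(size=(4, len(grid))) + 1j * rng.normal(size=(4, len(grid)))
w = weight_from_orbitals(orbitals)
idx = weighted_kmeans(grid, w, n_clusters=8)
```

Because the weight is built from the modulus, it is real and non-negative even when the orbitals are complex, which is what lets the usual weighted centroid update be reused unchanged. A production code would also need minimum-image distances for the periodic cell and a distributed assignment step, both omitted here.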
