Linear Run Time of Persistent Homology Computation with GPU Parallelization

Paper Title

Linear Run Time of Persistent Homology Computation with GPU Parallelization

Author

Rawson, Michael G.

Abstract

Persistent homology is a crucial invariant that is used in many areas to understand data. The $O(N^4)$ run time is a hindrance to its use on most large datasets. We give a parallelization method to utilize multi-core machines and clusters. We implement the computation of the $0^{th}$ persistent homology with OpenMP parallelization and observe a 1.75-fold performance increase by using 2 threads on a dual-core machine. We also benchmark the computation using larger numbers of threads and show that the thread computational overhead decreases performance. With GPU parallelization, we analytically and empirically decrease the run time scaling from $O(N^4)$ to $O(N^3)$ and even $O(N^2)$, where $N$ is the number of data points, for a large enough GPU. Next, we analytically show run time scaling $O(N)$ for an even larger GPU.
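
To make concrete what the $0^{th}$ persistent homology measures, the following is a minimal serial sketch in Python. It is not the paper's algorithm, and it involves neither OpenMP nor a GPU: it uses the standard union-find (Kruskal-style) approach, which is an assumption introduced here for illustration. The function name `zeroth_persistence` is hypothetical. The sketch assumes a Vietoris-Rips filtration in which an edge enters at a scale equal to its Euclidean length, so every connected component is born at scale 0 and dies when it merges into another component.

```python
from itertools import combinations
from math import dist  # Euclidean distance, available in Python 3.8+


def zeroth_persistence(points):
    """Return the death scales of 0-dimensional classes, in increasing order.

    Assumption: an edge (i, j) enters the filtration at scale dist(p_i, p_j).
    Every component is born at scale 0. The one component that never dies
    is omitted, so N points yield N - 1 death scales.
    """
    n = len(points)
    parent = list(range(n))  # union-find forest, one tree per component

    def find(i):
        # Walk to the root, halving the path along the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # All pairwise edges, processed from shortest to longest.
    edges = sorted(
        (dist(points[i], points[j]), i, j)
        for i, j in combinations(range(n), 2)
    )

    deaths = []
    for length, i, j in edges:
        root_i, root_j = find(i), find(j)
        if root_i != root_j:
            # Two components merge at this scale, so one class dies here.
            parent[root_i] = root_j
            deaths.append(length)
    return deaths


if __name__ == "__main__":
    print(zeroth_persistence([(0.0, 0.0), (1.0, 0.0), (3.0, 0.0)]))
```

For three collinear points at $x = 0, 1, 3$, the components merge at scales 1 and 2, which are the two finite death times. The pairwise distance computation is the part of this sketch that is independent across pairs; how the paper actually distributes its work across threads is not specified in the abstract.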
