Paper title

Porting and optimizing UniFrac for GPUs

Authors

Igor Sfiligoi, Daniel McDonald, Rob Knight

Abstract

UniFrac is a commonly used metric in microbiome research for comparing microbiome profiles to one another ("beta diversity"). The recently implemented Striped UniFrac added the capability to split the problem into many independent subproblems and exhibits near-linear scaling. In this paper we describe the steps undertaken in porting and optimizing Striped UniFrac for GPUs. We reduced the run time of computing UniFrac on the published Earth Microbiome Project dataset from 13 hours on an Intel Xeon E5-2680 v4 CPU to 12 minutes on an NVIDIA Tesla V100 GPU, and to about one hour on a laptop with an NVIDIA GTX 1050 (with minor loss in precision). Computing UniFrac on a larger dataset containing 113k samples reduced the run time from over one month on the CPU to less than 2 hours on the V100 and 9 hours on an NVIDIA RTX 2080 Ti GPU (with minor loss in precision). This was achieved by using OpenACC for generating the GPU offload code and by improving the memory access patterns. A BSD-licensed implementation is available, which produces a C shared library linkable by any programming language.
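
To illustrate the kind of loop that OpenACC can offload, here is a minimal sketch in C. It is not the authors' code: the function name `accumulate_branch`, the row-major stripe layout, and the pairing of sample `k` with sample `(k + s + 1) % n_samples` in stripe `s` are all assumptions made for illustration. For one tree branch, it adds a weighted, unnormalized contribution of the form `length * |emb[k] - emb[j]|` to each stripe entry.

```c
#include <stddef.h>

/* Hypothetical helper (not from the paper): add one branch's contribution
 * to every stripe of a striped distance matrix.
 *
 *   emb      per-sample proportions for this branch, length n_samples
 *   length   branch length
 *   stripes  n_stripes x n_samples values, row-major, updated in place
 *
 * Assumed layout: stripes[s * n_samples + k] holds the partial distance
 * between sample k and sample (k + s + 1) % n_samples.
 */
void accumulate_branch(const double *emb, double length,
                       double *stripes, size_t n_stripes, size_t n_samples)
{
    /* Every (s, k) entry is independent, so both loops can run in parallel.
     * A compiler without OpenACC support ignores the pragma and runs the
     * same loops serially on the CPU. */
    #pragma acc parallel loop collapse(2) \
        copyin(emb[0:n_samples]) copy(stripes[0:n_stripes * n_samples])
    for (size_t s = 0; s < n_stripes; s++) {
        for (size_t k = 0; k < n_samples; k++) {
            size_t j = (k + s + 1) % n_samples;   /* partner sample */
            double d = emb[k] - emb[j];
            stripes[s * n_samples + k] += length * (d < 0.0 ? -d : d);
        }
    }
}
```

In this sketch the inner loop walks `stripes` and `emb` with unit stride, which is one common way to keep GPU memory accesses contiguous. In a real port the arrays would normally stay resident on the device across branches rather than being copied on every call.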
