DAG-based Scheduling with Resource Sharing for Multi-task Applications in a Polyglot GPU Runtime

Paper Title

DAG-based Scheduling with Resource Sharing for Multi-task Applications in a Polyglot GPU Runtime

Authors

Alberto Parravicini, Arnaud Delamare, Marco Arnaboldi, Marco D. Santambrogio

Abstract

GPUs are readily available in cloud computing and personal devices, but their use for data processing acceleration has been slowed down by their limited integration with common programming languages such as Python or Java. Moreover, using GPUs to their full capabilities requires expert knowledge of asynchronous programming. In this work, we present a novel GPU run time scheduler for multi-task GPU computations that transparently provides asynchronous execution, space-sharing, and transfer-computation overlap without requiring in advance any information about the program dependency structure. We leverage the GrCUDA polyglot API to integrate our scheduler with multiple high-level languages and provide a platform for fast prototyping and easy GPU acceleration. We validate our work on 6 benchmarks created to evaluate task-parallelism and show an average of 44% speedup against synchronous execution, with no execution time slowdown compared to hand-optimized host code written using the C++ CUDA Graphs API.
