High level programming abstractions for leveraging hierarchical memories with micro-core architectures

Paper title

High level programming abstractions for leveraging hierarchical memories with micro-core architectures

Authors

Maurice Jamieson, Nick Brown

Abstract

Micro-core architectures combine many low memory, low power computing cores together in a single package. These are attractive for use as accelerators but due to limited on-chip memory and multiple levels of memory hierarchy, the way in which programmers offload kernels needs to be carefully considered. In this paper we use Python as a vehicle for exploring the semantics and abstractions of higher level programming languages to support the offloading of computational kernels to these devices. By moving to a pass by reference model, along with leveraging memory kinds, we demonstrate the ability to easily and efficiently take advantage of multiple levels in the memory hierarchy, even ones that are not directly accessible to the micro-cores. Using a machine learning benchmark, we perform experiments on both Epiphany-III and MicroBlaze based micro-cores, demonstrating the ability to compute with data sets of arbitrarily large size. To provide context of our results, we explore the performance and power efficiency of these technologies, demonstrating that whilst these two micro-core technologies are competitive within their own embedded class of hardware, there is still a way to go to reach HPC class GPUs.
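
The pass-by-reference idea described in the abstract can be illustrated with a small pure-Python simulation. This is only a sketch under stated assumptions, not the authors' implementation or API: `HostRef`, `fetch`, `kernel_sum_squares` and `local_capacity` are hypothetical names invented here. The kernel receives a reference to data held in a larger, slower memory level and pulls in one chunk at a time, so the data set can be far larger than the core's local memory.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class HostRef:
    """Hypothetical reference to data held in host memory.

    Stands in for a 'memory kind': the kernel never holds the whole
    array, it only requests slices through this reference.
    """
    data: Sequence[float]
    transfers: int = 0  # counts simulated host-to-core copies

    def __len__(self) -> int:
        return len(self.data)

    def fetch(self, start: int, count: int) -> List[float]:
        """Copy at most `count` elements, simulating one transfer."""
        self.transfers += 1
        return list(self.data[start:start + count])


def kernel_sum_squares(ref: HostRef, local_capacity: int) -> float:
    """Simulated offloaded kernel: sum of squares over referenced data.

    `local_capacity` models the number of elements that fit in the
    core's on-chip memory (an assumed parameter for this sketch).
    """
    total = 0.0
    for start in range(0, len(ref), local_capacity):
        chunk = ref.fetch(start, local_capacity)  # only this chunk is "on-chip"
        total += sum(x * x for x in chunk)
    return total


# The data set (10 elements) is larger than the simulated local memory (4).
data = [float(i) for i in range(10)]
ref = HostRef(data)
result = kernel_sum_squares(ref, local_capacity=4)
```

With pass by value, the whole of `data` would have to be copied to the core before the kernel starts, which fails once it exceeds on-chip memory. Passing a reference defers each transfer until the kernel actually needs that chunk.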
