Paper Title

Tensor Relational Algebra for Machine Learning System Design

Authors

Binhang Yuan, Dimitrije Jankov, Jia Zou, Yuxin Tang, Daniel Bourgeois, Chris Jermaine

Abstract

We consider the question: what is the abstraction that should be implemented by the computational engine of a machine learning system? Current machine learning systems typically push whole tensors through a series of compute kernels such as matrix multiplications or activation functions, where each kernel runs on an AI accelerator (ASIC) such as a GPU. This implementation abstraction provides little built-in support for ML systems to scale past a single machine, or for handling large models with matrices or tensors that do not easily fit into the RAM of an ASIC. In this paper, we present an alternative implementation abstraction called the tensor relational algebra (TRA). The TRA is a set-based algebra based on the relational algebra. Expressions in the TRA operate over binary tensor relations, where keys are multi-dimensional arrays and values are tensors. The TRA is easily executed with high efficiency in a parallel or distributed environment, and amenable to automatic optimization. Our empirical study shows that the optimized TRA-based back-end can significantly outperform alternatives for running ML workflows in distributed clusters.
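
To make the core abstraction concrete, below is a minimal Python/NumPy sketch, an illustration rather than the paper's implementation: it models a binary tensor relation as a dict mapping multi-dimensional integer keys to tensor blocks, and expresses block matrix multiplication relationally as a join on the shared inner index followed by a sum aggregation over matching output keys. The representation and all function names (`to_tensor_relation`, `matmul_tra`, `from_tensor_relation`) are assumptions made for illustration.

```python
import numpy as np

def to_tensor_relation(matrix, block):
    """Chunk a 2-D matrix into a binary tensor relation {(i, j): block}.

    The key (i, j) is the block's position in the grid; the value is the
    tensor (block) itself -- the "keys are multi-dimensional arrays,
    values are tensors" structure described in the abstract.
    """
    n_rows, n_cols = matrix.shape
    return {
        (i, j): matrix[i * block:(i + 1) * block, j * block:(j + 1) * block]
        for i in range(n_rows // block)
        for j in range(n_cols // block)
    }

def matmul_tra(rel_a, rel_b):
    """Block matrix multiply expressed relationally: join the two relations
    on the shared inner index, multiply the matching blocks, then aggregate
    (sum) the partial products that share the same output key."""
    out = {}
    for (i, k1), block_a in rel_a.items():
        for (k2, j), block_b in rel_b.items():
            if k1 == k2:  # join predicate on the inner dimension
                partial = block_a @ block_b
                out[(i, j)] = out.get((i, j), 0) + partial  # sum aggregation
    return out

def from_tensor_relation(rel):
    """Reassemble the full matrix from a tensor relation."""
    n_i = max(i for i, _ in rel) + 1
    n_j = max(j for _, j in rel) + 1
    return np.block([[rel[(i, j)] for j in range(n_j)] for i in range(n_i)])

# Usage: verify the relational matmul agrees with a direct NumPy matmul.
a = np.random.rand(4, 6)
b = np.random.rand(6, 8)
rel_c = matmul_tra(to_tensor_relation(a, 2), to_tensor_relation(b, 2))
assert np.allclose(from_tensor_relation(rel_c), a @ b)
```

The relational formulation is what makes parallel and distributed execution natural, as the abstract claims: the keyed blocks can be partitioned across machines, and the join and aggregation steps map onto standard distributed relational operators that an optimizer can reorder.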
