Deep Learning Inference Frameworks Benchmark

Paper title

Deep Learning Inference Frameworks Benchmark

Author

Pochelu, Pierrick

Abstract

Deep learning (DL) has been widely adopted in recent years, but DL models are computationally intensive. Researchers have therefore proposed diverse optimizations to accelerate predictions for end-user applications. However, no single inference framework currently dominates in terms of performance. This paper takes a holistic approach to an empirical comparison and analysis of four representative DL inference frameworks. First, given a selection of CPU-GPU configurations, we show that for a specific DL framework, different configurations of its settings may have a significant impact on prediction speed, memory, and computing power. Second, to the best of our knowledge, this study is the first to identify opportunities for accelerating an ensemble of models co-located on the same GPU. This measurement study offers practical guidance for service providers on deploying and delivering DL predictions.
