Paper Title


Modeling of Deep Neural Network (DNN) Placement and Inference in Edge Computing

Authors

Bensalem, Mounir, Dizdarević, Jasenka, Jukan, Admela

Abstract


With edge computing becoming an increasingly adopted concept in system architectures, its utilization is expected to increase further when combined with deep learning (DL) techniques. The idea of integrating demanding processing algorithms, such as Deep Neural Networks (DNNs), into Internet of Things (IoT) and edge devices has benefited in large measure from the development of edge computing hardware, as well as from adapting the algorithms for use in resource-constrained IoT devices. Surprisingly, there are no models yet for optimally placing and using machine learning in edge computing. In this paper, we propose the first model of optimal Deep Neural Network (DNN) placement and inference in edge computing. We present a mathematical formulation of the DNN Model Variant Selection and Placement (MVSP) problem, considering the inference latency of different model variants, the communication latency between nodes, and the utilization cost of edge computing nodes. We evaluate our model numerically and show that increasing model co-location decreases the average per-request latency (on the millisecond scale) by 33% for low load and by 21% for high load.
