Dynamic Split Computing for Efficient Deep Edge Intelligence

Paper Title

Dynamic Split Computing for Efficient Deep Edge Intelligence

Authors

Arian Bakhtiarnia, Nemanja Milošević, Qi Zhang, Dragana Bajović, Alexandros Iosifidis

Abstract

Deploying deep neural networks (DNNs) on IoT and mobile devices is a challenging task due to their limited computational resources. Thus, demanding tasks are often offloaded entirely to edge servers that can accelerate inference; however, this incurs communication costs and raises privacy concerns. In addition, this approach leaves the computational capacity of end devices unused. Split computing is a paradigm where a DNN is split into two sections: the first section is executed on the end device, and its output is transmitted to the edge server, where the final section is executed. Here, we introduce dynamic split computing, where the optimal split location is dynamically selected based on the state of the communication channel. By using natural bottlenecks that already exist in modern DNN architectures, dynamic split computing avoids retraining and hyperparameter optimization, and does not have any negative impact on the final accuracy of DNNs. Through extensive experiments, we show that dynamic split computing achieves faster inference in edge computing environments where the data rate and server load vary over time.
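
The selection step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that end-to-end latency is the sum of on-device compute time, transfer time (payload size divided by the current data rate), and server compute time scaled by a load factor. The `SplitPoint` type, the `latency` and `best_split` functions, the candidate names, and all timing and size figures are hypothetical values chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SplitPoint:
    """One candidate split location (all figures are hypothetical)."""
    name: str
    device_s: float      # time to run the first section on the end device, seconds
    payload_bits: float  # size of the intermediate output sent to the server, bits
    server_s: float      # time to run the final section on an unloaded server, seconds


def latency(split, data_rate_bps, server_load=1.0):
    """Assumed latency model: device compute + transfer + load-scaled server compute."""
    transfer_s = split.payload_bits / data_rate_bps
    return split.device_s + transfer_s + split.server_s * server_load


def best_split(candidates, data_rate_bps, server_load=1.0):
    """Return the candidate with the lowest estimated latency for the current channel."""
    return min(candidates, key=lambda s: latency(s, data_rate_bps, server_load))


# Hypothetical profile: later bottlenecks cost more device time but send less data.
CANDIDATES = [
    SplitPoint("offload_all", device_s=0.000, payload_bits=4_800_000, server_s=0.020),
    SplitPoint("bottleneck_a", device_s=0.030, payload_bits=1_200_000, server_s=0.015),
    SplitPoint("bottleneck_b", device_s=0.080, payload_bits=300_000, server_s=0.008),
]

if __name__ == "__main__":
    for rate in (1e9, 1e8, 1e7):  # 1 Gbit/s, 100 Mbit/s, 10 Mbit/s
        choice = best_split(CANDIDATES, rate)
        print(f"{rate:.0e} bit/s -> {choice.name} ({latency(choice, rate) * 1000:.1f} ms)")
```

With these made-up numbers, the chosen split moves from full offloading at 1 Gbit/s to the first bottleneck at 100 Mbit/s and the later bottleneck at 10 Mbit/s. Because only the split location changes and the network weights are untouched, a selection of this kind would leave accuracy unaffected, in line with the abstract's claim.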
