Receptive Field-based Segmentation for Distributed CNN Inference Acceleration in Collaborative Edge Computing

Paper Title

Receptive Field-based Segmentation for Distributed CNN Inference Acceleration in Collaborative Edge Computing

Authors

Nan Li, Alexandros Iosifidis, Qi Zhang

Abstract

This paper studies inference acceleration using distributed convolutional neural networks (CNNs) in a collaborative edge computing network. To avoid inference accuracy loss when partitioning an inference task, we propose receptive field-based segmentation (RFS). To reduce computation time and communication overhead, we propose a novel collaborative edge computing scheme that uses fused-layer parallelization to partition a CNN model into multiple blocks of convolutional layers. In this scheme, the collaborative edge servers (ESs) only need to exchange a small fraction of the sub-outputs after computing each fused block. In addition, to find the optimal partitioning of a CNN model into multiple blocks, we use a dynamic programming approach, named dynamic programming for fused-layer parallelization (DPFP). The experimental results show that DPFP can accelerate inference of VGG-16 by up to 73% compared with the pre-trained model, and that it outperforms the existing work MoDNN in all tested scenarios. Moreover, we evaluate the service reliability of DPFP under time-variant channels, which shows that DPFP is an effective solution for ensuring high service reliability under strict service deadlines.
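
To make the idea behind receptive field-based segmentation concrete, the following is a minimal sketch and not the authors' implementation. It assumes each layer in a fused block is described only by its kernel size, stride, and padding along the height axis, and that the output feature map is split by rows across servers. The names `split_rows`, `layer_heights`, and `input_rows_needed` are hypothetical. The sketch walks backwards through the block to find which input rows each server needs so that its share of the output is computed exactly.

```python
from typing import List, Tuple

# Hypothetical layer description: (kernel_size, stride, padding) along the height axis.
Layer = Tuple[int, int, int]


def split_rows(total: int, parts: int) -> List[Tuple[int, int]]:
    """Split rows 0..total-1 into `parts` contiguous inclusive ranges (assumes parts <= total)."""
    base, rem = divmod(total, parts)
    ranges, start = [], 0
    for i in range(parts):
        size = base + (1 if i < rem else 0)
        ranges.append((start, start + size - 1))
        start += size
    return ranges


def layer_heights(in_height: int, layers: List[Layer]) -> List[int]:
    """Return the feature-map height before each layer, followed by the final output height."""
    heights = [in_height]
    for k, s, p in layers:
        heights.append((heights[-1] + 2 * p - k) // s + 1)
    return heights


def input_rows_needed(out_lo: int, out_hi: int, layers: List[Layer],
                      in_height: int) -> Tuple[int, int]:
    """Map an inclusive output-row range of a fused block back to the input rows it depends on."""
    heights = layer_heights(in_height, layers)
    lo, hi = out_lo, out_hi
    for i in range(len(layers) - 1, -1, -1):
        k, s, p = layers[i]
        # Output row r of this layer reads input rows r*s - p .. r*s - p + k - 1.
        # Rows outside the feature map are zero padding, so clamp to valid rows.
        lo = max(lo * s - p, 0)
        hi = min(hi * s - p + k - 1, heights[i] - 1)
    return lo, hi


if __name__ == "__main__":
    # A VGG-style fused block: two 3x3 convolutions (stride 1, padding 1) and a 2x2 max pool.
    block = [(3, 1, 1), (3, 1, 1), (2, 2, 0)]
    in_h = 8
    out_h = layer_heights(in_h, block)[-1]
    for server, (lo, hi) in enumerate(split_rows(out_h, 2)):
        print(server, (lo, hi), input_rows_needed(lo, hi, block, in_h))
```

In this toy setting the two input segments overlap in a few rows. That overlap is why a server can compute its part of a fused block without changing the result, and it gives an intuition for why only a small part of the sub-outputs has to be exchanged between blocks.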
