Thesis Title
Enabling Deep Learning on Edge Devices
Thesis Author
Abstract
Deep neural networks (DNNs) have succeeded in many different perception tasks, e.g., computer vision, natural language processing, and reinforcement learning. High-performing DNNs rely heavily on intensive resource consumption. For example, training a DNN requires large dynamic memory, a large-scale dataset, and massive computation (a long training time); even inference with a DNN demands a large amount of static storage, computation (a long inference time), and energy. Therefore, state-of-the-art DNNs are often deployed on cloud servers with abundant high-performance compute nodes, high-bandwidth communication buses, shared storage infrastructure, and a high-power supply.

Recently, emerging intelligent applications, e.g., AR/VR, mobile assistants, and the Internet of Things, require us to deploy DNNs on resource-constrained edge devices. Compared to a cloud server, edge devices often have rather limited resources. To deploy DNNs on edge devices, we need to reduce their size, i.e., we target a better trade-off between resource consumption and model accuracy.

In this dissertation, we study four edge intelligence scenarios, i.e., Inference on Edge Devices, Adaptation on Edge Devices, Learning on Edge Devices, and Edge-Server Systems, and develop different methodologies to enable deep learning in each scenario. Since current DNNs are often over-parameterized, our goal is to find and reduce the redundancy of the DNNs in each scenario.
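The central premise above, that over-parameterized DNNs contain redundancy that can be removed to trade a little accuracy for much lower resource consumption, can be illustrated with a minimal magnitude-pruning sketch. This is a generic illustration in NumPy, not a method from the dissertation; the function name and the 75% sparsity setting are our own choices for the example.

```python
import numpy as np

def magnitude_prune(weights: np.ndarray, sparsity: float) -> np.ndarray:
    """Zero out the fraction `sparsity` of weights with the smallest magnitude.

    This is the simplest form of redundancy reduction: small-magnitude
    weights contribute little to the layer's output, so setting them to
    zero shrinks the model (when stored in sparse form) at a modest
    accuracy cost.
    """
    if not 0.0 <= sparsity < 1.0:
        raise ValueError("sparsity must be in [0, 1)")
    k = int(weights.size * sparsity)  # number of weights to remove
    if k == 0:
        return weights.copy()
    # k-th smallest magnitude serves as the pruning threshold
    threshold = np.partition(np.abs(weights).ravel(), k - 1)[k - 1]
    return np.where(np.abs(weights) > threshold, weights, 0.0)

# Example: prune 75% of a 4x4 weight matrix, leaving 4 nonzero weights.
rng = np.random.default_rng(0)
w = rng.standard_normal((4, 4))
w_pruned = magnitude_prune(w, 0.75)
print(np.count_nonzero(w_pruned))  # -> 4
```

In practice, pruning like this is applied per layer and followed by fine-tuning to recover accuracy; the dissertation's scenario-specific methods address how and where such redundancy is found in each deployment setting.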