Paper Title
SPINN: Synergistic Progressive Inference of Neural Networks over Device and Cloud
Paper Authors
Paper Abstract
Despite the soaring use of convolutional neural networks (CNNs) in mobile applications, uniformly sustaining high-performance inference on mobile has been elusive due to the excessive computational demands of modern CNNs and the increasing diversity of deployed devices. A popular alternative comprises offloading CNN processing to powerful cloud-based servers. Nevertheless, by relying on the cloud to produce outputs, emerging mission-critical and high-mobility applications, such as drone obstacle avoidance or interactive applications, can suffer from the dynamic connectivity conditions and the uncertain availability of the cloud. In this paper, we propose SPINN, a distributed inference system that employs synergistic device-cloud computation together with a progressive inference method to deliver fast and robust CNN inference across diverse settings. The proposed system introduces a novel scheduler that co-optimises the early-exit policy and the CNN splitting at run time, in order to adapt to dynamic conditions and meet user-defined service-level requirements. Quantitative evaluation illustrates that SPINN outperforms its state-of-the-art collaborative inference counterparts by up to 2x in achieved throughput under varying network conditions, reduces the server cost by up to 6.8x and improves accuracy by 20.7% under latency constraints, while providing robust operation under uncertain connectivity conditions and significant energy savings compared to cloud-centric execution.
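For illustration only, the sketch below shows the two mechanisms the abstract combines: an early-exit classifier attached at a candidate split point of a CNN, and offloading of the remaining layers only when that exit is not confident enough. It assumes a PyTorch-style model; the names EarlyExitCNN, progressive_infer and conf_threshold are hypothetical and do not reflect SPINN's actual code or API.

```python
import torch
import torch.nn as nn

# Illustrative sketch (not SPINN's implementation): a small CNN with an
# intermediate early-exit head. The device runs the blocks up to the split
# point; if the exit's confidence clears a threshold, inference stops locally,
# otherwise the intermediate features are offloaded to the cloud-side remainder.
class EarlyExitCNN(nn.Module):
    def __init__(self, num_classes: int = 10):
        super().__init__()
        # Device-side stem (up to the candidate split point).
        self.device_blocks = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
        )
        # Early-exit head attached at the split point.
        self.early_exit = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(64, num_classes)
        )
        # Cloud-side remainder, executed only when the exit is not confident.
        self.cloud_blocks = nn.Sequential(
            nn.Conv2d(64, 128, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(128, num_classes)
        )

def progressive_infer(model: EarlyExitCNN, x: torch.Tensor,
                      conf_threshold: float = 0.8):
    """Return (logits, exited_early). conf_threshold stands in for the
    early-exit policy that SPINN's scheduler would tune at run time."""
    features = model.device_blocks(x)              # runs on the device
    early_logits = model.early_exit(features)
    confidence = torch.softmax(early_logits, dim=1).max(dim=1).values
    if bool((confidence >= conf_threshold).all()):
        return early_logits, True                  # early exit: no offloading
    # Otherwise the features at the split point would be transferred over the
    # network and the remaining layers executed on the server.
    return model.cloud_blocks(features), False

if __name__ == "__main__":
    model = EarlyExitCNN().eval()
    with torch.no_grad():
        logits, exited = progressive_infer(model, torch.randn(1, 3, 32, 32))
    print("early exit:", exited, "prediction:", logits.argmax(dim=1).item())
```

In the system described by the paper, the scheduler would jointly pick the confidence threshold and the split point at run time, based on network conditions and user-defined service-level requirements; in this sketch both are fixed constants purely for clarity.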