论文标题
分布式学习和与压缩图像的推断
Distributed Learning and Inference with Compressed Images
论文作者
论文摘要
现代计算机视觉需要在训练模型和/或推断期间进行大量数据处理,一旦模型被部署。在物理分离的位置捕获和处理图像的方案越来越普遍(例如,自动驾驶汽车,云计算)。此外,许多设备的资源有限以存储或传输数据(例如,存储空间,通道容量)。在这些情况下,有损图像压缩起着至关重要的作用,可以有效地增加此类约束下收集的图像数量。但是,有损耗的压缩需要对数据的一些不希望的退化,这可能会损害手头下游分析任务的性能,因为在此过程中可能会丢失重要的语义信息。此外,我们可能只在训练时间有压缩图像,但能够在推理时使用原始图像,反之亦然,在这种情况下,下游模型遭受了协变量的偏移。在本文中,我们分析了这一现象,特别关注基于视觉的感知,以自主驾驶作为一种范式的情况。我们看到,语义信息和协变量转移的丧失确实存在,从而导致性能下降,这取决于压缩率。为了解决该问题,我们根据具有生成对抗网络(GAN)的图像恢复提出数据集修复。我们的方法对特定图像压缩方法和下游任务都是不可知的。并且具有不增加部署模型的额外成本的优势,这在资源有限的设备中尤为重要。提出的实验集中于语义分割作为一种具有挑战性的用例,涵盖了广泛的压缩率和不同的数据集,并显示我们的方法如何显着减轻压缩对下游视觉任务的负面影响。
Modern computer vision requires processing large amounts of data, both while training the model and/or during inference, once the model is deployed. Scenarios where images are captured and processed in physically separated locations are increasingly common (e.g. autonomous vehicles, cloud computing). In addition, many devices suffer from limited resources to store or transmit data (e.g. storage space, channel capacity). In these scenarios, lossy image compression plays a crucial role to effectively increase the number of images collected under such constraints. However, lossy compression entails some undesired degradation of the data that may harm the performance of the downstream analysis task at hand, since important semantic information may be lost in the process. Moreover, we may only have compressed images at training time but are able to use original images at inference time, or vice versa, and in such a case, the downstream model suffers from covariate shift. In this paper, we analyze this phenomenon, with a special focus on vision-based perception for autonomous driving as a paradigmatic scenario. We see that loss of semantic information and covariate shift do indeed exist, resulting in a drop in performance that depends on the compression rate. In order to address the problem, we propose dataset restoration, based on image restoration with generative adversarial networks (GANs). Our method is agnostic to both the particular image compression method and the downstream task; and has the advantage of not adding additional cost to the deployed models, which is particularly important in resource-limited devices. The presented experiments focus on semantic segmentation as a challenging use case, cover a broad range of compression rates and diverse datasets, and show how our method is able to significantly alleviate the negative effects of compression on the downstream visual task.