Paper Title
Wasserstein Adversarial Transformer for Cloud Workload Prediction
Paper Authors
Abstract
Predictive Virtual Machine (VM) auto-scaling is a promising technique for optimizing the operating costs and performance of cloud applications. Understanding the job arrival rate is crucial for accurately predicting future changes in cloud workloads and for proactively provisioning and de-provisioning the VMs that host the applications. However, developing a model that accurately predicts cloud workload changes is extremely challenging due to the dynamic nature of cloud workloads. Long Short-Term Memory (LSTM) models have been developed for cloud workload prediction. Unfortunately, state-of-the-art LSTM models rely on recurrence to predict, which naturally adds complexity and increases inference overhead as input sequences grow longer. To develop a cloud workload prediction model with high accuracy and low inference overhead, this work presents a novel time-series forecasting model called WGAN-gp Transformer, inspired by the Transformer network and improved Wasserstein GANs. The proposed method adopts a Transformer network as the generator and a multi-layer perceptron as the critic. Extensive evaluations with real-world workload traces show that WGAN-gp Transformer achieves 5 times faster inference and up to 5.1 percent higher prediction accuracy than the state-of-the-art approach. We also apply WGAN-gp Transformer to an auto-scaling mechanism on Google Cloud Platform; the WGAN-gp Transformer-based auto-scaling mechanism outperforms the LSTM-based mechanism by significantly reducing VM over-provisioning and under-provisioning rates.
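To make the generator/critic setup concrete, the following is a minimal sketch (in PyTorch) of the improved-WGAN objective the abstract refers to: an MLP critic scores forecast windows, and the gradient penalty replaces weight clipping. All layer sizes, the window length, and the penalty coefficient here are illustrative assumptions, not the authors' actual configuration; the Transformer generator is stubbed out with random tensors.

```python
import torch
import torch.nn as nn

class MLPCritic(nn.Module):
    """Multi-layer perceptron critic scoring a forecast window.
    Hypothetical sizes; the paper's exact architecture is not shown here."""
    def __init__(self, window: int, hidden: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(window, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),  # unbounded Wasserstein score, no sigmoid
        )

    def forward(self, x):
        return self.net(x)

def gradient_penalty(critic, real, fake, lam: float = 10.0):
    """WGAN-GP penalty: push the critic's gradient norm toward 1
    at random interpolates between real and generated samples."""
    eps = torch.rand(real.size(0), 1)
    interp = (eps * real + (1 - eps) * fake).requires_grad_(True)
    score = critic(interp)
    grads, = torch.autograd.grad(score.sum(), interp, create_graph=True)
    return lam * ((grads.norm(2, dim=1) - 1) ** 2).mean()

# Critic loss: maximize score(real) - score(fake), i.e. minimize the negative,
# plus the gradient penalty from the improved-WGAN training scheme.
window = 12
critic = MLPCritic(window)
real = torch.randn(8, window)  # stand-in for real workload windows
fake = torch.randn(8, window)  # stand-in for Transformer-generated forecasts
loss = (critic(fake).mean() - critic(real).mean()
        + gradient_penalty(critic, real, fake))
loss.backward()
```

In an actual training loop, `fake` would come from the Transformer generator conditioned on past workload history, and the generator would be updated to maximize the critic's score on its forecasts.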