Paper Title

Phoebe: QoS-Aware Distributed Stream Processing through Anticipating Dynamic Workloads

Authors

Morgan K. Geldenhuys, Dominik Scheinert, Odej Kao, Lauritz Thamsen

Abstract

Distributed Stream Processing systems have become an essential part of big data processing platforms. They are characterized by the high-throughput processing of near to real-time event streams with the goal of delivering low-latency results and thus enabling time-sensitive decision making. At the same time, results are expected to be consistent even in the presence of partial failures where exactly-once processing guarantees are required for correctness. Stream processing workloads are oftentimes dynamic in nature which makes static configurations highly inefficient as time goes by. Static resource allocations will almost certainly either negatively impact upon the Quality of Service and/or result in higher operational costs. In this paper we present Phoebe, a proactive approach to system auto-tuning for Distributed Stream Processing jobs executing on dynamic workloads. Our approach makes use of parallel profiling runs, QoS modeling, and runtime optimization to provide a general solution whereby configuration parameters are automatically tuned to ensure a stable service as well as alignment with recovery time Quality of Service targets. Phoebe makes use of Time Series Forecasting to gain an insight into future workload requirements thereby delivering scaling decisions which are accurate, long-lived, and reliable. Our experiments demonstrate that Phoebe is able to deliver a stable service while at the same time reducing resource over-provisioning.
