Paper Title
DV-ARPA: Data Variety Aware Resource Provisioning for Big Data Processing in Accumulative Applications
Authors
Abstract
In cloud computing, the resource provisioning approach has a large impact on processing cost, especially for Big Data processing. Because of data variety, the performance of a virtual machine (VM) may differ depending on the contents of the data blocks it processes. Data-variety-oblivious allocation reduces VM performance and increases processing cost. The total cost of a job can therefore be reduced by matching VMs to the given data blocks. We use a data-variety-aware resource allocation approach to reduce the processing cost of the considered job. We divide the input data into data blocks, define the significance of each data block, and choose appropriate VMs based on that significance to reduce cost. To detect the significance of each data portion, we use a specific sampling method. The approach is applicable to accumulative applications. We evaluate it using well-known benchmarks on configured servers. The results show that our provisioning approach reduces processing cost by up to 35% compared to other approaches.
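The pipeline described in the abstract (split the input into blocks, sample each block to estimate its significance, then pick a VM per block) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not specify the significance metric, the sampling method, or the cost model, so everything here is an assumption made for illustration. In particular, significance is taken to be the fraction of sampled records that match a predicate, the VM types `small` and `large` with their prices and throughputs are hypothetical, and the selection rule (cheapest VM that meets a per-block deadline) is one plausible choice among many.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class VMType:
    """Hypothetical VM description; the fields are illustrative assumptions."""
    name: str
    price_per_hour: float
    records_per_hour: float  # throughput on significant records


def split_into_blocks(records, block_size):
    """Divide the input data into consecutive blocks of at most block_size records."""
    return [records[i:i + block_size] for i in range(0, len(records), block_size)]


def estimate_significance(block, is_significant, sample_size, rng):
    """Estimate the share of significant records from a random sample of the block."""
    if not block:
        return 0.0
    sample = rng.sample(block, min(sample_size, len(block)))
    return sum(1 for record in sample if is_significant(record)) / len(sample)


def choose_vm(block_len, significance, vm_types, deadline_hours):
    """Pick the cheapest VM that finishes the estimated work within the deadline.

    Assumed cost model: work = block_len * significance, and
    cost = (work / throughput) * price_per_hour.
    If no VM meets the deadline, fall back to the fastest one.
    """
    work = block_len * significance
    feasible = [vm for vm in vm_types
                if work / vm.records_per_hour <= deadline_hours]
    candidates = feasible or [max(vm_types, key=lambda vm: vm.records_per_hour)]
    # Ties on cost (for example, zero work) are broken by the lower hourly price.
    return min(candidates,
               key=lambda vm: (work / vm.records_per_hour * vm.price_per_hour,
                               vm.price_per_hour))


def provision(records, block_size, is_significant, vm_types,
              deadline_hours=1.0, sample_size=100, seed=0):
    """Return one VM name per data block, in block order."""
    rng = random.Random(seed)
    plan = []
    for block in split_into_blocks(records, block_size):
        significance = estimate_significance(block, is_significant, sample_size, rng)
        plan.append(choose_vm(len(block), significance, vm_types, deadline_hours).name)
    return plan


# Hypothetical VM catalogue: "large" is faster but costs more per record.
VM_TYPES = [
    VMType("small", price_per_hour=1.0, records_per_hour=1000.0),
    VMType("large", price_per_hour=6.0, records_per_hour=5000.0),
]

if __name__ == "__main__":
    data = ["hit"] * 2000 + ["miss"] * 2000
    print(provision(data, 2000, lambda r: r == "hit", VM_TYPES))
```

Under these assumptions, a block dense in significant records is sent to the faster VM because the cheaper one would miss the deadline, while a block with few significant records stays on the cheaper VM. A variety-oblivious allocator would assign the same VM type to both blocks.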