Cost-effective and QoS-aware resource allocation for cloud computing

As the most important problem in cloud computing, resource allocation not only affects the costs of cloud operators and users but also impacts the performance of cloud jobs. Provisioning too many resources in clouds wastes energy and money, while provisioning too few will cause...


Bibliographic Details
Main Author: Wei, Lei
Other Authors: Foh Chuan Heng
Format: Theses and Dissertations
Language: English
Published: 2016
Subjects:
Online Access:https://hdl.handle.net/10356/66012
Institution: Nanyang Technological University
Language: English
id sg-ntu-dr.10356-66012
record_format dspace
institution Nanyang Technological University
building NTU Library
continent Asia
country Singapore
Singapore
content_provider NTU Library
collection DR-NTU
language English
topic DRNTU::Engineering::Computer science and engineering
spellingShingle DRNTU::Engineering::Computer science and engineering
Wei, Lei
Cost-effective and QoS-aware resource allocation for cloud computing
description As the most important problem in cloud computing, resource allocation not only affects the costs of cloud operators and users but also impacts the performance of cloud jobs. Provisioning too many resources in clouds wastes energy and money, while provisioning too few causes performance degradation of cloud applications. Current research on resource allocation mainly focuses on homogeneous allocation and treats the CPU as the most important resource. However, as the resource demands of cloud workloads become increasingly heterogeneous across resource types, current methods are not suitable for other types of jobs, such as memory-intensive applications, nor are they efficient at offering economical, high-quality resource allocation in clouds.
In this thesis, we first propose a resource provisioning method, BigMem, that accounts for the characteristics of memory-based resource allocation. Memory-intensive applications have recently become popular for high-throughput and low-latency computing. Current resource provisioning methods focus more on other resources, such as CPU and network bandwidth, which are considered the bottlenecks in traditional cloud applications. For memory-intensive jobs, however, main memory is the bottleneck resource for performance. Therefore, main memory should be the first consideration when allocating and provisioning resources for virtual machines (VMs) in clouds hosting memory-intensive applications. By considering the unique behavior of resource provisioning for memory-intensive jobs, BigMem effectively reduces resource usage for dynamic workloads in clouds. Specifically, we use Markov chain modeling to periodically determine the required number of physical machines (PMs) and further optimize resource utilization through VM migration and resource overcommitment (a minimal provisioning sketch follows this paragraph). We evaluate our design using simulation with synthetic and real-world traces. Experimental results show that BigMem provisions an appropriate amount of resources for highly dynamic workloads while maintaining an acceptable service-level agreement (SLA). In comparison, BigMem reduces the average number of active machines in the data center by 63% and 27% relative to peak-load provisioning and heuristic methods, respectively. These results translate into good performance for users and low cost for cloud providers.
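Purely as an illustration (not the model developed in the thesis), the sketch below sizes a PM pool from a simple birth-death view of VM demand: with Poisson arrivals and exponential VM lifetimes, the number of concurrent VMs is Poisson-distributed in steady state, and the pool is grown until the probability of demand exceeding the overcommitted capacity falls below an SLA target. All function names and parameters are hypothetical:

    # Purely illustrative, not the thesis's model: size a PM pool so that the
    # steady-state probability of VM demand exceeding capacity stays below an
    # SLA violation target. Assumes Poisson VM arrivals and exponential VM
    # lifetimes, so the number of concurrent VMs is Poisson with mean
    # rho = arrival_rate / departure_rate (an M/M/infinity view of demand).
    import math

    def pms_needed(arrival_rate, departure_rate, vms_per_pm,
                   overcommit=1.0, sla_violation_target=0.01, max_pms=10_000):
        """Smallest number of PMs whose overcommitted VM slots overflow with
        probability <= sla_violation_target (all names are hypothetical)."""
        rho = arrival_rate / departure_rate        # mean number of concurrent VMs
        for pms in range(1, max_pms + 1):
            capacity = int(pms * vms_per_pm * overcommit)
            term = math.exp(-rho)                  # P(N = 0)
            cdf = term
            for k in range(1, capacity + 1):
                term *= rho / k                    # P(N = k) from P(N = k - 1)
                cdf += term
            if 1.0 - cdf <= sla_violation_target:
                return pms
        raise ValueError("demand too high for max_pms")

    # Example: 12 VM requests per minute, 30-minute mean lifetime, 8 VMs per PM,
    # 1.25x overcommit; re-run each provisioning period with freshly estimated rates.
    print(pms_needed(arrival_rate=12, departure_rate=1 / 30, vms_per_pm=8,
                     overcommit=1.25))

Re-running such a calculation each provisioning period, with updated arrival and departure estimates, mirrors the periodic determination of active PMs that the abstract describes.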
To support different types of workloads in clouds (such as memory-intensive and computation-intensive applications), we then propose a heterogeneous resource allocation method, skewness-avoidance multi-resource allocation (SAMR), which considers the skewness across different resource types to optimize resource usage in clouds. Current IaaS clouds provision resources as VMs with homogeneous resource configurations, in which each resource type of a VM takes a similar share of the capacity of a PM. However, most user jobs demand different amounts of different resources; for instance, high-performance-computing jobs require more CPU cores, while memory-intensive applications require more memory. Existing homogeneous resource allocation mechanisms cause resource starvation, in which dominant resources are starved while non-dominant resources are wasted. To overcome this issue, SAMR allocates resources according to jobs' diverse requirements for different resource types. Our solution includes a job allocation algorithm that places heterogeneous workloads appropriately to avoid skewed resource utilization on PMs (see the placement sketch after this description), and a model-based approach that estimates the appropriate number of active PMs for operating SAMR. We show that the model-based approach has relatively low complexity for practical operation while providing accurate estimation. Extensive simulation results show the effectiveness of SAMR and its performance advantages over its counterparts.
Finally, we turn to the resource allocation problem in a specific application: media computing in clouds. As the "biggest big data", video streaming contributes the largest portion of global network traffic today and will continue to do so in the future. Due to heterogeneous mobile devices, networks, and user preferences, the demand for transcoding source videos into different versions has increased significantly. However, video transcoding is time-consuming, and guaranteeing quality-of-service (QoS) for large volumes of video data is very challenging, particularly for real-time applications with strict delay requirements. In this thesis, we propose a cloud-based online video transcoding system (COVT) that aims to offer an economical, QoS-guaranteed solution for online large-volume video transcoding. COVT uses performance profiling to characterize how transcoding tasks perform on different infrastructures. Based on the profiles, we model the transcoding system as a queue and derive its QoS values using queueing theory. With the analytically derived relationship between QoS values and the number of CPU cores required by the transcoding workload, COVT solves an optimization problem to obtain the minimum resource reservation for given QoS constraints (see the queueing sketch at the end of this description). A task scheduling algorithm further adjusts the resource reservation dynamically and schedules tasks so as to guarantee QoS. We implement a prototype of COVT and experimentally study its performance on real-world workloads. Experimental results show that COVT effectively provisions the minimum amount of resources for the predefined QoS. To validate the effectiveness of the proposed method on large-scale video data, we further perform a simulation evaluation, which again shows that COVT achieves cost-effective and QoS-aware video transcoding in a cloud environment.
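The following minimal sketch, again purely illustrative and not SAMR's actual algorithm, shows one way a placement step can avoid skewed utilization: each job goes to the PM whose post-placement utilization is most balanced across resource types, measured here by the standard deviation of per-resource utilization. The capacities, dictionary keys, and balance metric are assumptions:

    # Purely illustrative, not SAMR's actual algorithm: place a job on the PM
    # whose post-placement utilization is most balanced across resource types,
    # using the standard deviation of per-resource utilization as the "skewness".
    from statistics import pstdev

    PM_CAPACITY = {"cpu": 32, "mem_gb": 128}       # assumed homogeneous PMs

    def skew(used):
        """Spread of utilization across resource types for one PM."""
        return pstdev(used[r] / PM_CAPACITY[r] for r in PM_CAPACITY)

    def place(job, pms):
        """pms: per-PM dict of resources already in use. Returns the index of the
        chosen PM, or None if the job fits nowhere and a new PM must be activated."""
        best, best_skew = None, float("inf")
        for i, used in enumerate(pms):
            after = {r: used[r] + job[r] for r in PM_CAPACITY}
            if any(after[r] > PM_CAPACITY[r] for r in PM_CAPACITY):
                continue                           # job does not fit on this PM
            s = skew(after)
            if s < best_skew:
                best, best_skew = i, s
        return best

    pms = [{"cpu": 8, "mem_gb": 100}, {"cpu": 20, "mem_gb": 40}]
    memory_heavy_job = {"cpu": 2, "mem_gb": 24}
    print(place(memory_heavy_job, pms))            # -> 1, the CPU-heavy PM

Balancing utilization this way keeps a memory-heavy job away from a PM that is already memory-dominated, which is the kind of skew avoidance the description refers to.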
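For the queueing step, a minimal sketch under textbook assumptions rather than COVT's measured profiles: treat the transcoding cluster as an M/M/c queue and reserve the fewest cores such that the probability a task waits longer than its delay bound stays below a target, using the Erlang-C formula. The arrival rate, service rate, and delay bound are hypothetical inputs that, in COVT, would come from performance profiling:

    # Purely illustrative, textbook assumptions rather than COVT's measured
    # profiles: model the transcoding cluster as an M/M/c queue (arrival rate
    # lam, per-core service rate mu) and reserve the fewest cores such that
    # P(a task waits longer than delay_bound) <= violation_target, using the
    # Erlang-C formula for the waiting probability.
    import math

    def erlang_c(cores, offered_load):
        """Probability an arriving task must wait in an M/M/c queue."""
        a = offered_load
        idle_terms = sum(a ** k / math.factorial(k) for k in range(cores))
        wait_term = a ** cores / math.factorial(cores) * cores / (cores - a)
        return wait_term / (idle_terms + wait_term)

    def min_cores(lam, mu, delay_bound, violation_target=0.05, max_cores=4096):
        """Smallest core reservation meeting the delay QoS (names hypothetical)."""
        a = lam / mu                               # offered load in cores' worth of work
        c = math.floor(a) + 1                      # stability requires c > a
        while c <= max_cores:
            # For M/M/c: P(W > t) = ErlangC * exp(-(c*mu - lam) * t).
            p_violate = erlang_c(c, a) * math.exp(-(c * mu - lam) * delay_bound)
            if p_violate <= violation_target:
                return c
            c += 1
        raise ValueError("QoS target infeasible within max_cores")

    # Example: 2 transcoding tasks/s, one core finishes a task in 10 s on average
    # (mu = 0.1/s), and at most 5% of tasks may wait longer than 30 s.
    print(min_cores(lam=2.0, mu=0.1, delay_bound=30.0))

Inverting an analytical delay expression in this way is the general idea behind deriving the minimum resource reservation from QoS constraints, as the description outlines.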
author2 Foh Chuan Heng
author_facet Foh Chuan Heng
Wei, Lei
format Theses and Dissertations
author Wei, Lei
author_sort Wei, Lei
title Cost-effective and QoS-aware resource allocation for cloud computing
title_short Cost-effective and QoS-aware resource allocation for cloud computing
title_full Cost-effective and QoS-aware resource allocation for cloud computing
title_fullStr Cost-effective and QoS-aware resource allocation for cloud computing
title_full_unstemmed Cost-effective and QoS-aware resource allocation for cloud computing
title_sort cost-effective and qos-aware resource allocation for cloud computing
publishDate 2016
url https://hdl.handle.net/10356/66012
_version_ 1759857224194195456
spelling sg-ntu-dr.10356-66012 2023-03-04T00:43:22Z Cost-effective and QoS-aware resource allocation for cloud computing Wei, Lei Foh Chuan Heng He Bingsheng Cai Jianfei School of Computer Engineering Centre for Multimedia and Network Technology DRNTU::Engineering::Computer science and engineering DOCTOR OF PHILOSOPHY (SCE) 2016-02-29T02:36:56Z 2016-02-29T02:36:56Z 2016 Thesis Wei, L. (2016). Cost-effective and QoS-aware resource allocation for cloud computing. Doctoral thesis, Nanyang Technological University, Singapore. https://hdl.handle.net/10356/66012 10.32657/10356/66012 en 125 p. application/pdf