Simplified and effective resource provisioning for scientific workflows in IaaS clouds

Cloud computing has become a popular computing platform for many scientific applications from various research fields. The workflow model is widely used by scientists to manage and analyze those large-scale scientific applications. Due to the pay-as-you-go pricing scheme, resource provisioning for s...

Bibliographic Details
Main Author: Zhou, Chi
Other Authors: He Bingsheng
Format: Theses and Dissertations
Language: English
Published: 2016
Subjects:
Online Access: http://hdl.handle.net/10356/66076
Institution: Nanyang Technological University
id sg-ntu-dr.10356-66076
record_format dspace
institution Nanyang Technological University
building NTU Library
continent Asia
country Singapore
content_provider NTU Library
collection DR-NTU
language English
topic DRNTU::Engineering::Computer science and engineering::Computer systems organization
spellingShingle DRNTU::Engineering::Computer science and engineering::Computer systems organization
Zhou, Chi
Simplified and effective resource provisioning for scientific workflows in IaaS clouds
description Cloud computing has become a popular computing platform for many scientific applications from various research fields. The workflow model is widely used by scientists to manage and analyze these large-scale scientific applications. Under the pay-as-you-go pricing scheme, resource provisioning for scientific workflows in Infrastructure-as-a-Service (IaaS) clouds is an important and complicated problem for optimizing the cost and performance of workflows. The complexity comes from severe cloud performance and price dynamics and from varied user requirements on performance and cost. The IaaS cloud environment is dynamic, with performance dynamics caused by interference from concurrent executions and price dynamics such as the spot prices offered by Amazon EC2. However, existing studies of resource provisioning for workflows are not aware of these cloud dynamics and assume static workflow execution times in the cloud. IaaS clouds usually offer different types of resources (e.g., virtual machines and storage), and workflow owners can have different optimization requirements on performance and cost. However, we find that most existing studies adopt ad hoc optimization strategies rather than a systematic approach. Specific heuristics are designed for the goals and constraints of each optimization problem and are not flexible enough for varied and evolving user requirements. To address the above issues, we propose three projects to achieve flexible and effective optimizations for resource provisioning of scientific workflows in IaaS clouds. Specifically, we propose a probabilistic scheduling system called Dyna to effectively optimize the monetary cost of workflows in the cloud. Dyna adopts a probabilistic QoS notion to explicitly expose the performance and cost dynamics of IaaS clouds to users. 
We develop an A*-based hybrid instance configuration method to reduce the expected monetary cost of workflows while satisfying user-specified probabilistic deadline guarantees. To simplify the complexities in cloud offerings and user requirements, we propose a transformation-based optimization framework named ToF. ToF abstracts the common performance and monetary cost optimizations as transformations and formulates six basic workflow transformation operations. An arbitrary performance and cost optimization process can be represented as a transformation plan (i.e., a sequence of basic transformation operations). Together, all possible transformations form a huge optimization space. We further develop a cost-model-guided planner to efficiently find the optimized transformation for a predefined goal (e.g., minimizing the monetary cost under a given performance requirement). Based on the above two projects, we propose a declarative optimization engine called Deco, which considers both the cloud dynamics and the flexibility of optimizations. Deco provides a cloud- and workflow-specific declarative language for users to specify various workflow optimization problems. The declarative language supports the probabilistic QoS notion. We further propose a probabilistic optimization approach for evaluating the declarative optimization goals and constraints in the cloud. To accelerate the search for a solution, we leverage the parallelism of GPUs. We integrate our systems into Pegasus, a popular workflow management system. 
Experimental results with real-world scientific workflow applications on Amazon EC2 and a cloud simulator demonstrate that (1) the cloud dynamics greatly affect the monetary cost and performance optimization results of scientific workflows; (2) our declarative language is expressive enough to describe a wide class of optimization problems for scientific workflows; and (3) our systems are able to optimize monetary cost and performance goals while satisfying probabilistic QoS constraints.
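The probabilistic deadline guarantee mentioned in the description can be illustrated with a small sketch. This is not the method used by Dyna, ToF, or Deco; it is a plain Monte Carlo estimate under assumed conditions: a workflow modelled as a simple chain of tasks, task runtimes drawn from made-up uniform distributions, and hypothetical hourly prices. All names (`evaluate`, `cheapest_feasible`, `configs`) are illustrative and do not come from the thesis.

```python
import random

def evaluate(samplers, price_per_hour, deadline_h, trials=10_000, seed=0):
    """Estimate (P(runtime <= deadline), expected cost) for a chain of tasks.

    Each sampler draws one task's runtime in hours from an assumed
    distribution, standing in for cloud performance dynamics.
    """
    rng = random.Random(seed)
    hits = 0
    total_cost = 0.0
    for _ in range(trials):
        runtime = sum(sample(rng) for sample in samplers)
        if runtime <= deadline_h:
            hits += 1
        total_cost += runtime * price_per_hour
    return hits / trials, total_cost / trials

def cheapest_feasible(configs, deadline_h, required_prob):
    """Return (name, prob, cost) of the cheapest configuration that meets the
    deadline with at least required_prob, or None if none qualifies."""
    best = None
    for name, (price, samplers) in configs.items():
        prob, cost = evaluate(samplers, price, deadline_h)
        if prob >= required_prob and (best is None or cost < best[2]):
            best = (name, prob, cost)
    return best

# Hypothetical instance types: a slow, cheap one and a fast, expensive one.
configs = {
    "small": (0.10, [lambda r: r.uniform(1.0, 3.0)] * 3),
    "large": (0.40, [lambda r: r.uniform(0.5, 1.5)] * 3),
}

choice = cheapest_feasible(configs, deadline_h=5.0, required_prob=0.95)
print(choice)
```

With these made-up numbers the cheap configuration has the lower expected cost but rarely meets the 5-hour deadline, so only the more expensive one satisfies the 95% requirement. This is the kind of trade-off a probabilistic QoS constraint exposes to the user.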
author2 He Bingsheng
author_facet He Bingsheng
Zhou, Chi
format Theses and Dissertations
author Zhou, Chi
author_sort Zhou, Chi
title Simplified and effective resource provisioning for scientific workflows in IaaS clouds
title_sort simplified and effective resource provisioning for scientific workflows in iaas clouds
publishDate 2016
url http://hdl.handle.net/10356/66076
_version_ 1759853528521637888
spelling sg-ntu-dr.10356-66076 2023-03-04T00:34:12Z Simplified and effective resource provisioning for scientific workflows in IaaS clouds Zhou, Chi He Bingsheng School of Computer Engineering Parallel and Distributed Computing Centre DRNTU::Engineering::Computer science and engineering::Computer systems organization Doctor of Philosophy 2016-03-09T01:56:16Z 2016-03-09T01:56:16Z 2016 Thesis http://hdl.handle.net/10356/66076 en 147 p. application/pdf