Gemini: An Adaptive Performance-Fairness Scheduler for Data-Intensive Cluster Computing

In data-intensive cluster computing platforms such as Hadoop YARN, performance and fairness are two important factors in system design and optimization. Many previous studies target either performance or fairness alone, without considering the tradeoff between the two. Recen...

Bibliographic Details
Main Authors: Niu, Zhaojie, Tang, Shanjiang, He, Bingsheng
Other Authors: School of Computer Engineering
Format: Conference or Workshop Item
Language: English
Published: 2016
Subjects:
Online Access: https://hdl.handle.net/10356/80355
http://hdl.handle.net/10220/40532
Institution: Nanyang Technological University
id sg-ntu-dr.10356-80355
record_format dspace
spelling sg-ntu-dr.10356-80355 2020-05-28T07:17:47Z
Gemini: An Adaptive Performance-Fairness Scheduler for Data-Intensive Cluster Computing
Niu, Zhaojie
Tang, Shanjiang
He, Bingsheng
School of Computer Engineering
2015 IEEE 7th International Conference on Cloud Computing Technology and Science (CloudCom)
Optimization
Processor scheduling
Adaptation models
Computational modeling
MOE (Min. of Education, S’pore)
Accepted version
2016-05-12T04:40:54Z 2019-12-06T13:47:50Z
2015
Conference Paper
Niu, Z., Tang, S., & He, B. (2015). Gemini: An Adaptive Performance-Fairness Scheduler for Data-Intensive Cluster Computing. 2015 IEEE 7th International Conference on Cloud Computing Technology and Science (CloudCom), 66-73.
https://hdl.handle.net/10356/80355
http://hdl.handle.net/10220/40532
10.1109/CloudCom.2015.52
191998
en
© 2015 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works. The published version is available at: [http://dx.doi.org/10.1109/CloudCom.2015.52].
8 p.
application/pdf
institution Nanyang Technological University
building NTU Library
country Singapore
collection DR-NTU
language English
topic Optimization
Processor scheduling
Adaptation models
Computational modeling
description In data-intensive cluster computing platforms such as Hadoop YARN, performance and fairness are two important factors in system design and optimization. Many previous studies target either performance or fairness alone, without considering the tradeoff between the two. Recent studies observe that there is a tradeoff between performance and fairness because of resource contention between users/jobs. However, their scheduling algorithms for bi-criteria optimization of performance and fairness are static and do not consider how different workload characteristics affect the tradeoff. In this paper, we propose an adaptive scheduler called Gemini for Hadoop YARN. We first develop a regression-based model to estimate the performance improvement and the fairness loss under shared computation compared with the exclusive, non-sharing scenario. Next, we use the model to guide resource allocation for pending tasks so as to optimize cluster performance under a user-defined fairness level. Instead of using a static scheduling policy, Gemini adaptively chooses a scheduling policy according to the currently running workload. We implement Gemini in Hadoop YARN. Experimental results show that Gemini outperforms the state-of-the-art approach in two respects: 1) for the same fairness loss, Gemini improves performance by up to 225% and 200% in a real deployment and a large-scale simulation, respectively; 2) for the same performance improvement, Gemini reduces the fairness loss by up to 70% and 62.5% in a real deployment and a large-scale simulation, respectively.
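The two steps named in the description (fit a regression model, then pick the best-performing option that stays within a user-defined fairness level) can be illustrated with a toy Python sketch. This is not Gemini's actual model or its YARN implementation, neither of which is given in this record. The single-feature linear fit, the names `fit_line`, `PolicyEstimate`, and `choose_policy`, the policy labels, and all numbers are hypothetical and serve only to show the general shape of a fairness-constrained choice.

```python
from dataclasses import dataclass


def fit_line(xs, ys):
    """Ordinary least squares for y = a * x + b on one feature.

    Assumption: a single hypothetical workload feature; the paper's
    real regression inputs are not stated in this record.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx
    b = mean_y - a * mean_x
    return a, b


@dataclass(frozen=True)
class PolicyEstimate:
    name: str                # hypothetical policy label
    perf_improvement: float  # predicted gain vs. exclusive non-sharing
    fairness_loss: float     # predicted fairness loss vs. the same baseline


def choose_policy(estimates, max_fairness_loss):
    """Return the highest-performance policy within the fairness budget.

    If no policy fits the budget, fall back to the one with the
    smallest predicted fairness loss.
    """
    feasible = [e for e in estimates if e.fairness_loss <= max_fairness_loss]
    if not feasible:
        return min(estimates, key=lambda e: e.fairness_loss)
    return max(feasible, key=lambda e: e.perf_improvement)


# Illustrative use with made-up numbers.
slope, intercept = fit_line([0.0, 1.0, 2.0], [1.0, 3.0, 5.0])
candidates = [
    PolicyEstimate("fair-first", 0.0, 0.0),
    PolicyEstimate("mixed", 0.3, 0.1),
    PolicyEstimate("perf-first", 0.6, 0.4),
]
chosen = choose_policy(candidates, max_fairness_loss=0.2)
```

An adaptive scheduler in this spirit would re-estimate the candidates as the running workload changes and repeat the choice, rather than fixing one policy in advance.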
author2 School of Computer Engineering
author_facet School of Computer Engineering
Niu, Zhaojie
Tang, Shanjiang
He, Bingsheng
format Conference or Workshop Item
author Niu, Zhaojie
Tang, Shanjiang
He, Bingsheng
author_sort Niu, Zhaojie
title Gemini: An Adaptive Performance-Fairness Scheduler for Data-Intensive Cluster Computing
publishDate 2016
url https://hdl.handle.net/10356/80355
http://hdl.handle.net/10220/40532
_version_ 1681059289352896512