Epsilon, a cluster scheduler for Kubernetes clusters
The adoption of shared computer clusters for executing high-performance computing workloads has allowed many organizations with limited financial capabilities to access computing power that otherwise might be too costly for them to build. However, to support these heterogeneous workloads, cluster sc...
Main Author: | Neo, Alex Jing Hui |
---|---|
Other Authors: | Lee Bu Sung, Francis |
Format: | Final Year Project |
Language: | English |
Published: | Nanyang Technological University, 2021 |
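Subjects: | Engineering::Computer science and engineering::Computer applications; Engineering::Computer science and engineering::Information systems |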
Online Access: | https://hdl.handle.net/10356/147629 |
Institution: | Nanyang Technological University |
id |
sg-ntu-dr.10356-147629 |
---|---|
record_format |
dspace |
spelling |
sg-ntu-dr.10356-147629 2021-04-08T13:08:13Z Epsilon, a cluster scheduler for Kubernetes clusters Neo, Alex Jing Hui Lee Bu Sung, Francis School of Computer Science and Engineering Singapore Advanced Research and Education Network Asi@Connect Project, TEIN EBSLEE@ntu.edu.sg Engineering::Computer science and engineering::Computer applications Engineering::Computer science and engineering::Information systems The adoption of shared computer clusters for executing high-performance computing workloads has allowed many organizations with limited financial capabilities to access computing power that otherwise might be too costly for them to build. However, to support these heterogeneous workloads, cluster schedulers are getting increasingly complex to develop and maintain as features are added based on different workload requirements. Kubernetes is a container orchestration platform designed to simplify the deployment of containers in a computer cluster. Kubernetes provides a monolithic cluster scheduler, which is responsible for allocating resources to containers. The author developed Epsilon as a cluster scheduler for Kubernetes using the microservices model as a foundation. In partnership with AsiaConnect, Epsilon's goal is to act as a scheduler and a starting point for examining the viability of using microservices to create a scheduler that helps developers implement updates more quickly and supports heterogeneous workloads. Being a microservices-based scheduler, Epsilon is built from multiple microservices that have strict service boundaries and are kept simple in design to prevent code complexity from increasing due to modifications or feature updates. Epsilon contains multiple microservices for different functionalities, with some microservices making up the core system and the rest serving as supporting features. 
The core microservices consist of the Coordinator, Scheduler, and Queue microservices, which are responsible for the monitoring of new pods, scheduling of new pods, and communication between microservices, respectively. Support microservices include the Autoscaler and Retry microservices, which are responsible for automated scaling of scheduler services and rescheduling of failed pods, respectively. One of Epsilon's goals is to allow developers to commit changes quickly. Epsilon achieves this by splitting up the scheduler code into multiple smaller microservices. By spreading out the scheduler code in this way, developers can develop or update different components of the scheduler concurrently, reducing the time taken to commit the changes. Epsilon’s distributed nature also provides an opportunity to scale the scheduler microservices in or out, improving performance and resiliency because multiple identical copies of the scheduler microservice operate concurrently. Epsilon was deployed and tested on a 54-node Kubernetes cluster using Amazon Web Services EC2 instances. To improve the scheduler’s resiliency, Epsilon was deployed with 3 scheduler microservices. As multiple schedulers are making scheduling decisions concurrently, randomization is used as a mitigation technique to reduce the occurrence of scheduler conflicts among the 3 scheduler microservices. The experiments conducted included various tests related to the scheduler’s performance, load balancing, and support for heterogeneous workloads. In one of the experiments, the scheduling performance of Epsilon was compared to the default Kubernetes scheduler. The experiment recorded the time taken by each scheduler to schedule different numbers of pods. Bachelor of Engineering (Computer Engineering) 2021-04-08T13:06:49Z 2021-04-08T13:06:49Z 2021 Final Year Project (FYP) Neo, A. J. H. (2021). Epsilon, a cluster scheduler for Kubernetes clusters. 
Final Year Project (FYP), Nanyang Technological University, Singapore. https://hdl.handle.net/10356/147629 https://hdl.handle.net/10356/147629 en application/pdf Nanyang Technological University |
institution |
Nanyang Technological University |
building |
NTU Library |
continent |
Asia |
country |
Singapore |
content_provider |
NTU Library |
collection |
DR-NTU |
language |
English |
topic |
Engineering::Computer science and engineering::Computer applications Engineering::Computer science and engineering::Information systems |
spellingShingle |
Engineering::Computer science and engineering::Computer applications Engineering::Computer science and engineering::Information systems Neo, Alex Jing Hui Epsilon, a cluster scheduler for Kubernetes clusters |
description |
The adoption of shared computer clusters for executing high-performance computing workloads has allowed many organizations with limited financial capabilities to access computing power that otherwise might be too costly for them to build. However, to support these heterogeneous workloads, cluster schedulers are getting increasingly complex to develop and maintain as features are added based on different workload requirements.
Kubernetes is a container orchestration platform designed to simplify the deployment of containers in a computer cluster. Kubernetes provides a monolithic cluster scheduler, which is responsible for allocating resources to containers. The author developed Epsilon as a cluster scheduler for Kubernetes using the microservices model as a foundation.
In partnership with AsiaConnect, Epsilon's goal is to act as a scheduler and a starting point for examining the viability of using microservices to create a scheduler that helps developers implement updates more quickly and supports heterogeneous workloads.
Being a microservices-based scheduler, Epsilon is built from multiple microservices that have strict service boundaries and are kept simple in design to prevent code complexity from increasing due to modifications or feature updates. Epsilon contains multiple microservices for different functionalities, with some microservices making up the core system and the rest serving as supporting features. The core microservices consist of the Coordinator, Scheduler, and Queue microservices, which are responsible for the monitoring of new pods, scheduling of new pods, and communication between microservices, respectively. Support microservices include the Autoscaler and Retry microservices, which are responsible for automated scaling of scheduler services and rescheduling of failed pods, respectively.
One of Epsilon's goals is to allow developers to commit changes quickly. Epsilon achieves this by splitting up the scheduler code into multiple smaller microservices. By spreading out the scheduler code in this way, developers can develop or update different components of the scheduler concurrently, reducing the time taken to commit the changes. Epsilon’s distributed nature also provides an opportunity to scale the scheduler microservices in or out, improving performance and resiliency because multiple identical copies of the scheduler microservice operate concurrently.
Epsilon was deployed and tested on a 54-node Kubernetes cluster using Amazon Web Services EC2 instances. To improve the scheduler’s resiliency, Epsilon was deployed with 3 scheduler microservices. As multiple schedulers are making scheduling decisions concurrently, randomization is used as a mitigation technique to reduce the occurrence of scheduler conflicts among the 3 scheduler microservices.
The experiments conducted included various tests related to the scheduler’s performance, load balancing, and support for heterogeneous workloads. In one of the experiments, the scheduling performance of Epsilon was compared to the default Kubernetes scheduler. The experiment recorded the time taken by each scheduler to schedule different numbers of pods. |
author2 |
Lee Bu Sung, Francis |
author_facet |
Lee Bu Sung, Francis Neo, Alex Jing Hui |
format |
Final Year Project |
author |
Neo, Alex Jing Hui |
author_sort |
Neo, Alex Jing Hui |
title |
Epsilon, a cluster scheduler for Kubernetes clusters |
title_short |
Epsilon, a cluster scheduler for Kubernetes clusters |
title_full |
Epsilon, a cluster scheduler for Kubernetes clusters |
title_fullStr |
Epsilon, a cluster scheduler for Kubernetes clusters |
title_full_unstemmed |
Epsilon, a cluster scheduler for Kubernetes clusters |
title_sort |
epsilon, a cluster scheduler for kubernetes clusters |
publisher |
Nanyang Technological University |
publishDate |
2021 |
url |
https://hdl.handle.net/10356/147629 |
_version_ |
1696984372782039040 |
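The description above notes that randomization is used to reduce conflicts when 3 scheduler microservices make decisions concurrently. The record does not say how Epsilon implements this, so the following is only a minimal sketch under stated assumptions: every scheduler in a round sees the same stale snapshot of free CPU, any two schedulers picking the same node in a round counts as a conflict, and the names `pick_node`, `count_conflicts`, and `simulate` are hypothetical helpers rather than part of Epsilon or Kubernetes.

```python
import random
from collections import Counter


def pick_node(free_cpu, request, rng=None, top_k=3):
    """Return the name of a node that can fit `request`, or None.

    With rng=None the choice is deterministic (most free CPU first, ties
    broken by name). With an rng, the choice is uniform among the top_k
    feasible nodes. Both policies are assumptions made for illustration.
    """
    feasible = sorted(
        (name for name, free in free_cpu.items() if free >= request),
        key=lambda name: (-free_cpu[name], name),
    )
    if not feasible:
        return None
    if rng is None:
        return feasible[0]
    return rng.choice(feasible[:top_k])


def count_conflicts(picks):
    """Count picks that landed on a node another scheduler already chose."""
    counts = Counter(p for p in picks if p is not None)
    return sum(n - 1 for n in counts.values())


def simulate(rounds=1000, schedulers=3, nodes=54, seed=0, randomized=True):
    """Total conflicts over `rounds`; the snapshot is never updated."""
    rng = random.Random(seed)
    # Hypothetical cluster state: 54 nodes with 4 free CPUs each.
    free_cpu = {f"node-{i:02d}": 4 for i in range(nodes)}
    conflicts = 0
    for _ in range(rounds):
        picks = [
            pick_node(free_cpu, 1, rng if randomized else None, top_k=nodes)
            for _ in range(schedulers)
        ]
        conflicts += count_conflicts(picks)
    return conflicts


if __name__ == "__main__":
    print("deterministic conflicts:", simulate(randomized=False))
    print("randomized conflicts:   ", simulate(randomized=True))
```

In this toy model the deterministic policy sends all three schedulers to the same node every round, giving two conflicts per round, while the randomized policy spreads picks across the feasible nodes and collides far less often. A real scheduler would also have to detect and resolve the conflicts that remain, which this sketch does not attempt.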