HIGH-LEVEL INFORMATION AND METRICS-AWARE KUBERNETES AUTOSCALING FOR MICROSERVICE APPLICATION

The advancement of cloud computing technology demands optimal methods for managing computing resources. Kubernetes, as a container orchestration platform, can be used to manage the deployment of microservice applications. However, Kubernetes autoscaling is still limited to low-level metrics such as CPU and memory usage. High-level metrics such as Service Level Objectives (SLOs) are more important because they reflect an application's overall performance requirements. In addition, Kubernetes autoscaling does not yet use application information such as past traffic, so it remains reactive, which can reduce application availability. To address this, a proactive autoscaling system was developed that leverages application information, specifically traffic patterns, and targets a high-level metric, the tail-latency SLO. The system uses the PatchTST time-series model to forecast traffic and a neural-network regression model to predict the tail latency of each service; the number of replicas is then determined from the ratio of predicted tail latency to the SLO threshold. In tests, the system improved application availability over the Horizontal Pod Autoscaler (HPA) by 0.0411 percentage points, from 99.7402% to 99.7813%, and reduced average tail latency from 302.05 ms to 277.62 ms, although not yet below the threshold. However, using the tail-latency-to-SLO ratio to set the replica count caused the number of replicas to oscillate, making it ineffective as a scaling determinant.
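The thesis sets the replica count from the ratio of predicted tail latency to the SLO threshold. A minimal sketch of such a rule follows; all names and the clamping bounds are hypothetical, and the thesis's actual implementation may differ:

```python
import math

def desired_replicas(current_replicas: int,
                     predicted_tail_latency_ms: float,
                     slo_threshold_ms: float,
                     min_replicas: int = 1,
                     max_replicas: int = 10) -> int:
    """Scale the replica count by the ratio of predicted tail latency
    to the SLO threshold, clamped to [min_replicas, max_replicas].

    A ratio above 1 (latency exceeds the SLO) scales out; a ratio
    below 1 scales in.
    """
    ratio = predicted_tail_latency_ms / slo_threshold_ms
    replicas = math.ceil(current_replicas * ratio)
    return max(min_replicas, min(max_replicas, replicas))

# Latency 20% over a 250 ms SLO: 4 replicas -> ceil(4 * 1.2) = 5
print(desired_replicas(4, 300.0, 250.0))  # 5
# Latency well under the SLO: 4 replicas -> ceil(4 * 0.6) = 3
print(desired_replicas(4, 150.0, 250.0))  # 3
```

This mirrors the form of the HPA's own scaling formula (ceil of current replicas times a metric ratio), and it also illustrates the oscillation the abstract reports: small swings in predicted latency around the threshold flip the ceiling between adjacent replica counts, so scale-out and scale-in can alternate without any damping.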

Bibliographic Details
Main Author: Phandiarta, Brianaldo
Format: Final Project
Language: Indonesian
Online Access: https://digilib.itb.ac.id/gdl/view/85061
Institution: Institut Teknologi Bandung
Record ID: id-itb.:85061
Record Updated: 2024-08-19
Keywords: Kubernetes, proactive autoscaling, application traffic, tail latency