HIGH-LEVEL INFORMATION AND METRICS-AWARE KUBERNETES AUTOSCALING FOR MICROSERVICE APPLICATION

The advancement of cloud computing technology demands optimal methods for managing computing resources. Kubernetes, as a container orchestration platform, can be used to manage the deployment of microservice applications. However, Kubernetes autoscaling is still limited to low-level metrics such as CPU and memory usage. High-level metrics such as Service Level Objectives (SLOs) are more important because they reflect an application's overall performance requirements. In addition, Kubernetes autoscaling does not yet use application information such as past traffic, so it remains reactive, which can reduce application availability. To address this, a proactive autoscaling system was developed that leverages application information, specifically traffic patterns, and targets a high-level metric, the tail-latency SLO. The system uses the PatchTST time-series model to forecast traffic and a neural-network regression model to predict the tail latency of each service; the number of replicas is then determined from the ratio of predicted tail latency to the SLO threshold. In tests, the system improved application availability over the Horizontal Pod Autoscaler (HPA) by 0.0411 percentage points, from 99.7402% to 99.7813%, and reduced average tail latency from 302.05 ms to 277.62 ms, although not yet below the threshold. However, using the tail-latency-to-SLO ratio to set the replica count caused the number of replicas to oscillate, making it ineffective as a scaling determinant.
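The thesis sets the replica count from the ratio of predicted tail latency to the SLO threshold. A minimal sketch of such a rule follows; all names and the clamping bounds are hypothetical, and the thesis's actual implementation may differ:

```python
import math

def desired_replicas(current_replicas: int,
                     predicted_tail_latency_ms: float,
                     slo_threshold_ms: float,
                     min_replicas: int = 1,
                     max_replicas: int = 10) -> int:
    """Scale the replica count by the ratio of predicted tail latency
    to the SLO threshold, clamped to [min_replicas, max_replicas].

    A ratio above 1 (latency exceeds the SLO) scales out; a ratio
    below 1 scales in.
    """
    ratio = predicted_tail_latency_ms / slo_threshold_ms
    replicas = math.ceil(current_replicas * ratio)
    return max(min_replicas, min(max_replicas, replicas))

# Latency 20% over a 250 ms SLO: 4 replicas -> ceil(4 * 1.2) = 5
print(desired_replicas(4, 300.0, 250.0))  # 5
# Latency well under the SLO: 4 replicas -> ceil(4 * 0.6) = 3
print(desired_replicas(4, 150.0, 250.0))  # 3
```

This mirrors the form of the HPA's own scaling formula (ceil of current replicas times a metric ratio), and it also illustrates the oscillation the abstract reports: small swings in predicted latency around the threshold flip the ceiling between adjacent replica counts, so scale-out and scale-in can alternate without any damping.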

Bibliographic Details
Main Author: Phandiarta, Brianaldo
Format: Final Project
Language: Indonesian
Online Access: https://digilib.itb.ac.id/gdl/view/85061
Institution: Institut Teknologi Bandung
Record ID: id-itb.:85061
Record Updated: 2024-08-19
Keywords: Kubernetes, proactive autoscaling, application traffic, tail latency