HIGH-LEVEL INFORMATION AND METRICS-AWARE KUBERNETES AUTOSCALING FOR MICROSERVICE APPLICATION


Saved in:
Bibliographic Details
Main Author: Phandiarta, Brianaldo
Format: Final Project
Language: Indonesia
Online Access: https://digilib.itb.ac.id/gdl/view/85061
Institution: Institut Teknologi Bandung
Description
Summary: The advancement of cloud computing technology demands optimal methods for managing computing resources. Kubernetes, as a container orchestration platform, can be used to manage the deployment of microservice applications. However, autoscaling in Kubernetes is still limited to low-level metrics such as CPU and memory usage. High-level metrics such as Service Level Objectives (SLOs) are more important because they reflect an application's overall performance requirements. In addition, Kubernetes autoscaling does not yet exploit application information, such as past application traffic. The result is reactive autoscaling, which can reduce application availability. To address this, a proactive autoscaling system was developed that leverages application information, specifically traffic patterns, and uses a high-level metric, the tail-latency SLO. The system uses the PatchTST time-series model to forecast traffic and a neural-network regression model to predict the tail latency of each service. The number of replicas is then determined from the ratio of predicted tail latency to the SLO threshold. In testing, the system improved application availability over the Horizontal Pod Autoscaler (HPA) by 0.0411 percentage points, from 99.7402% to 99.7813%, and reduced average tail latency from 302.05 ms to 277.62 ms, although latency remained above the threshold. However, using the ratio of tail latency to the SLO threshold to set the replica count caused the number of replicas to oscillate, making it ineffective as a scaling determinant.
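As a minimal sketch of the scaling rule the abstract describes: the thesis only states that the replica count is derived from the ratio of tail latency to the SLO threshold, so the proportional formula below (modeled on HPA's proportional scaling) is a hypothetical interpretation, and the function name and parameters are illustrative, not from the original work.

```python
import math

def desired_replicas(current_replicas: int, tail_latency_ms: float, slo_ms: float) -> int:
    """Scale replicas in proportion to how far predicted tail latency
    exceeds (or falls below) the SLO threshold.

    Hypothetical rule: the thesis states only that the latency/SLO ratio
    determines the replica count, not the exact formula.
    """
    ratio = tail_latency_ms / slo_ms
    # Round up so latency above the SLO always adds capacity; never drop below 1.
    return max(1, math.ceil(current_replicas * ratio))

# Example with the abstract's figures: latency 302.05 ms against a 250 ms SLO
# scales 3 replicas up; a latency well under the SLO scales back down.
print(desired_replicas(3, 302.05, 250.0))
print(desired_replicas(4, 125.0, 250.0))
```

A purely proportional rule like this also hints at the oscillation the abstract reports: when the ratio swings around 1 between control intervals, the replica count is repeatedly nudged up and down instead of settling.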