HIGH-LEVEL INFORMATION AND METRICS-AWARE KUBERNETES AUTOSCALING FOR MICROSERVICE APPLICATION
Main Author: | |
---|---|
Format: | Final Project |
Language: | Indonesian |
Online Access: | https://digilib.itb.ac.id/gdl/view/85061 |
Institution: | Institut Teknologi Bandung |
Summary: | The advancement of cloud computing technology demands optimal methods for
managing computing resources. Kubernetes, as a container orchestration platform,
can be used to manage the deployment of microservice applications. However,
autoscaling in Kubernetes is still limited to low-level metrics such as CPU and
memory usage. High-level metrics like Service Level Objectives (SLOs) are more
crucial because they reflect the overall performance requirements of an application.
Additionally, autoscaling in Kubernetes does not yet utilize application
information, such as past application traffic. This results in reactive autoscaling,
which can reduce the availability of applications.
To address this issue, a proactive autoscaling system was developed that leverages
application information, specifically application traffic patterns, and uses high-level
metrics such as tail latency SLO. This system was created using the PatchTST time
series model to forecast traffic and a neural network regression model to predict the
tail latency of each service. The number of replicas is determined based on the ratio
of tail latency to the SLO threshold.
Test results showed that the developed autoscaling system improved application
availability by 0.0411 percentage points over the Horizontal Pod Autoscaler (HPA),
from 99.7402% to 99.7813%. It also reduced the average tail latency from
302.05 ms to 277.62 ms, although not yet to below the SLO threshold. Determining
the replica count from the ratio of tail latency to the SLO threshold caused the
number of replicas to oscillate, so this ratio proved ineffective as a determinant
for the number of replicas. |
---|
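
The summary states that the number of replicas is determined from the ratio of tail latency to the SLO threshold, but it does not give the exact rule. The sketch below is one plausible reading, not the author's implementation: it borrows the proportional form used by the Horizontal Pod Autoscaler (desired replicas = ceil(current replicas × current metric / target metric)) and applies it to a predicted tail latency. The function name, the replica bounds, and the tolerance band are all assumptions made for illustration, as are the numbers in the usage example.

```python
import math


def desired_replicas(
    current_replicas: int,
    predicted_tail_latency_ms: float,
    slo_threshold_ms: float,
    min_replicas: int = 1,    # assumed lower bound, not from the source
    max_replicas: int = 10,   # assumed upper bound, not from the source
    tolerance: float = 0.1,   # assumed dead band around ratio 1.0
) -> int:
    """Scale the replica count by the ratio of tail latency to the SLO threshold.

    Hypothetical helper: the source only says the replica count is based on
    this ratio; the proportional rule and the dead band are assumptions.
    """
    if current_replicas < 1:
        raise ValueError("current_replicas must be at least 1")
    if slo_threshold_ms <= 0:
        raise ValueError("slo_threshold_ms must be positive")

    ratio = predicted_tail_latency_ms / slo_threshold_ms

    # Inside the dead band the ratio is treated as "close enough" to 1.0,
    # so the replica count is left unchanged.
    if abs(ratio - 1.0) <= tolerance:
        target = current_replicas
    else:
        # Proportional rule: scale out when latency exceeds the SLO,
        # scale in when it is below.
        target = math.ceil(current_replicas * ratio)

    # Clamp to the allowed replica range.
    return max(min_replicas, min(max_replicas, target))


# Illustrative values only; the source does not state the SLO threshold.
print(desired_replicas(4, 300.0, 200.0))  # latency 1.5x the SLO: scale out
print(desired_replicas(4, 210.0, 200.0))  # within the dead band: unchanged
print(desired_replicas(4, 100.0, 200.0))  # latency half the SLO: scale in
```

A purely proportional rule reacts to every change in the ratio, which is consistent with the replica oscillation the summary reports. The dead band shown here is a common damping measure; the summary does not say whether the author used one.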