DEVELOPMENT OF CASCADING CIRCUIT BREAKER SYSTEM USING EVENT-DRIVEN APPROACH IN MICROSERVICES
Main Author: | |
---|---|
Format: | Final Project |
Language: | Indonesia |
Online Access: | https://digilib.itb.ac.id/gdl/view/85066 |
Institution: | Institut Teknologi Bandung |
Summary: | Resilience, particularly the handling of service failures, is a significant challenge in
microservices. Circuit breakers address it by limiting calls to failed
services. However, a change in a circuit breaker's state is known only to
directly connected services, so services further along the chain continue
making calls. One study addresses this issue with a cascading circuit breaker
system that prevents premature calls to failed services. Nevertheless, that system has
limitations: it cannot inform newly deployed services, nor services
that do not depend on the service that is aware of the state change. There are also
research opportunities to improve how circuit breakers operate.
To address these problems, this final project develops an event-driven
cascading circuit breaker system. This approach is chosen because the event-driven
communication style is more flexible than the request-response style
used in the existing cascading circuit breaker system. In addition, the system
provides a mechanism to reroute calls to alternative endpoints when
an endpoint is in the open state. The cascading circuit breaker system is deployed as
a sidecar for added flexibility.
The system was successfully implemented and orchestrated using Kubernetes.
Information about a change to the open state was successfully broadcast to both
newly deployed services and services that do not depend on the service that is aware of the state
change. The system also successfully rerouted calls to alternative endpoints
when an endpoint was in the open state. The system's average overhead is relatively
low, at 2.86 ms. However, the time required to propagate a state change
was measured at 2.12 minutes. |
---|
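
The idea described in the summary can be illustrated with a minimal sketch. It is not the project's implementation: the `EventBus` class is a hypothetical in-memory stand-in for a real message broker, and the names `StateChange`, `Sidecar`, `resolve`, and the example URLs are assumptions made only for illustration. The sketch shows three behaviours the summary mentions: every subscriber receives an open-state event regardless of call dependencies, a service that joins later still learns the current state, and calls to an open endpoint are rerouted to an alternative.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Set


@dataclass(frozen=True)
class StateChange:
    """Event announcing a circuit breaker state change (hypothetical schema)."""
    endpoint: str
    state: str  # "open" or "closed"


class EventBus:
    """In-memory stand-in for a message broker (an assumption, not the real system)."""

    def __init__(self) -> None:
        self._subscribers: List[Callable[[StateChange], None]] = []
        self._latest: Dict[str, StateChange] = {}

    def subscribe(self, handler: Callable[[StateChange], None]) -> None:
        self._subscribers.append(handler)
        # Replay the latest known state so newly deployed services catch up.
        for event in self._latest.values():
            handler(event)

    def publish(self, event: StateChange) -> None:
        self._latest[event.endpoint] = event
        # Every subscriber is notified, whether or not it calls the publisher.
        for handler in list(self._subscribers):
            handler(event)


class Sidecar:
    """Tracks open endpoints and reroutes calls to configured alternatives."""

    def __init__(self, name: str, bus: EventBus,
                 alternatives: Optional[Dict[str, List[str]]] = None) -> None:
        self.name = name
        self.open_endpoints: Set[str] = set()
        self.alternatives = alternatives or {}
        bus.subscribe(self._on_event)

    def _on_event(self, event: StateChange) -> None:
        if event.state == "open":
            self.open_endpoints.add(event.endpoint)
        else:
            self.open_endpoints.discard(event.endpoint)

    def resolve(self, endpoint: str) -> Optional[str]:
        """Return the endpoint to call, an alternative, or None if all are open."""
        if endpoint not in self.open_endpoints:
            return endpoint
        for alt in self.alternatives.get(endpoint, []):
            if alt not in self.open_endpoints:
                return alt
        return None


# Usage: hypothetical endpoints, chosen only for the demonstration.
bus = EventBus()
primary = "http://orders-v1/api"
fallback = "http://orders-v2/api"
service_a = Sidecar("service-a", bus, {primary: [fallback]})

bus.publish(StateChange(primary, "open"))
late_service = Sidecar("service-late", bus)  # deployed after the event

print(service_a.resolve(primary))     # http://orders-v2/api
print(late_service.resolve(primary))  # None (no alternative configured)
```

A real deployment would replace `EventBus` with a broker and run the `Sidecar` logic in a separate container next to each service; the replay-on-subscribe step here is one simple way to inform late joiners and is not necessarily the mechanism used in the project.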