Enjoy your observability: An industrial survey of microservice tracing and analysis

Microservice systems are often deployed in complex cloud-based environments and may involve a large number of service instances being dynamically created and destroyed. It is thus essential to ensure observability to understand these microservice systems’ behaviors and troubleshoot their problems. A...

Full description

Saved in:
Bibliographic Details
Main Authors: LI, Bowen, PENG, Xin, XIANG, Qilin, WANG, Hanzhang, XIE, Tao, SUN, Jun, LIU, Xuanzhe
Format: text
Language:English
Published: Institutional Knowledge at Singapore Management University 2022
Subjects:
Online Access:https://ink.library.smu.edu.sg/sis_research/6942
https://ink.library.smu.edu.sg/context/sis_research/article/7945/viewcontent/Li2021_Article_EnjoyYourObservability_pvoa.pdf
Tags: Add Tag
No Tags, Be the first to tag this record!
Institution: Singapore Management University
Language: English
Description
Summary:Microservice systems are often deployed in complex cloud-based environments and may involve a large number of service instances being dynamically created and destroyed. It is thus essential to ensure observability to understand these microservice systems’ behaviors and troubleshoot their problems. As an important means to achieve the observability, distributed tracing and analysis is known to be challenging. While many companies have started implementing distributed tracing and analysis for microservice systems, it is not clear whether existing approaches fulfill the required observability. In this article, we present our industrial survey on microservice tracing and analysis through interviewing developers and operation engineers of microservice systems from ten companies. Our survey results offer a number of findings. For example, large microservice systems commonly adopt a tracing and analysis pipeline, and the implementations of the pipeline in different companies reflect different tradeoffs among a variety of concerns. Visualization and statistic-based metrics are the most common means for trace analysis, while more advanced analysis techniques such as machine learning and data mining are seldom used. Microservice tracing and analysis is a new big data problem for software engineering, and its practices breed new challenges and opportunities.