Reliability assessment for distributed systems via communication abstraction and refinement

Distributed systems like cloud-based services are ever more popular. Assessing the reliability of distributed systems is highly non-trivial. Particularly, the order of executions among distributed components adds a dimension of non-determinism, which invalidates existing reliability assessment metho...

Bibliographic Details
Main Authors: GUI, Lin, SUN, Jun, LIU, Yang, DONG, Jin Song
Format: text
Language: English
Published: Institutional Knowledge at Singapore Management University 2015
Subjects:
Online Access: https://ink.library.smu.edu.sg/sis_research/4955
https://ink.library.smu.edu.sg/context/sis_research/article/5958/viewcontent/2771783.2771794.pdf
Institution: Singapore Management University
id sg-smu-ink.sis_research-5958
record_format dspace
spelling sg-smu-ink.sis_research-5958 2020-02-27T03:14:10Z 2015-07-01T07:00:00Z text application/pdf info:doi/10.1145/2771783.2771794 http://creativecommons.org/licenses/by-nc-nd/4.0/ Research Collection School Of Computing and Information Systems eng
institution Singapore Management University
building SMU Libraries
continent Asia
country Singapore
content_provider SMU Libraries
collection InK@SMU
language English
topic MDPs
reliability assessment
probabilistic model checking
Software Engineering
description Distributed systems such as cloud-based services are increasingly popular. Assessing their reliability is highly non-trivial. In particular, the order of execution among distributed components adds a dimension of non-determinism, which invalidates existing reliability assessment methods based on Markov chains. Probabilistic model checking based on models such as Markov decision processes (MDPs) is designed to deal with scenarios involving both probabilistic behavior (e.g., reliabilities of system components) and non-determinism. However, its application is currently limited by state space explosion, which makes reliability assessment of distributed systems particularly difficult. In this work, we improve probabilistic model checking through a method of abstraction and reduction, which controls the communications among system components and actively reduces the size of each component. We prove the soundness and completeness of the proposed approach. Through an implementation in a software toolkit and evaluations on several systems, we show that our approach often reduces the size of the state space by several orders of magnitude while still producing sound and accurate assessments.
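
The description's point that execution order adds non-determinism on top of probabilistic failure can be illustrated with a toy Markov decision process. The sketch below is not the paper's abstraction and refinement method; it is plain value iteration for minimum and maximum reachability on a hand-written MDP. The state names, action names, and all probabilities are hypothetical and chosen only for illustration.

```python
# Toy MDP: a scheduler picks which of two components runs first
# (non-determinism); each component then succeeds or fails with a
# fixed probability. All names and numbers here are made up.
MDP = {
    "start": {
        "a_first": [(0.9, "a_done"), (0.1, "fail")],
        "b_first": [(0.8, "b_done"), (0.2, "fail")],
    },
    "a_done": {"run_b": [(0.8, "ok"), (0.2, "fail")]},
    "b_done": {"run_a": [(0.95, "ok"), (0.05, "fail")]},
    "ok": {},    # absorbing: system completed successfully
    "fail": {},  # absorbing: system failed
}


def reach_probability(mdp, target, choose, max_iters=1000, tol=1e-12):
    """Value iteration for the probability of eventually reaching `target`.

    `choose` is `min` or `max`: it resolves the non-deterministic choice
    of action pessimistically or optimistically.
    """
    values = {s: 1.0 if s == target else 0.0 for s in mdp}
    for _ in range(max_iters):
        new_values = {}
        for state, actions in mdp.items():
            if state == target:
                new_values[state] = 1.0
            elif not actions:
                new_values[state] = 0.0  # absorbing non-target state
            else:
                new_values[state] = choose(
                    sum(p * values[nxt] for p, nxt in outcomes)
                    for outcomes in actions.values()
                )
        delta = max(abs(new_values[s] - values[s]) for s in mdp)
        values = new_values
        if delta < tol:
            break
    return values


low = reach_probability(MDP, "ok", min)["start"]
high = reach_probability(MDP, "ok", max)["start"]
print(f"reliability lies in [{low:.2f}, {high:.2f}]")
```

Because the scheduler's choice is not governed by a probability, the result is an interval of reliabilities rather than the single number a Markov chain would give. Exploring every interleaving of many components this way is also what drives the state space explosion the description mentions.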
format text
author GUI, Lin
SUN, Jun
LIU, Yang
DONG, Jin Song
title Reliability assessment for distributed systems via communication abstraction and refinement
publisher Institutional Knowledge at Singapore Management University
publishDate 2015
_version_ 1770575157359804416