Voting algorithms for large scale fault-tolerant systems
Fault tolerance is the feature of computing systems which can continue their normal operation in the presence of fault(s). In line with this, various methods have been proposed in the last three decades. Redundancy is one of the methods in building fault-tolerant systems which is implementable at bo...
Saved in:
Main Author: | |
---|---|
Format: | Thesis |
Language: | English |
Published: |
2011
|
Online Access: | http://psasir.upm.edu.my/id/eprint/42337/1/FK%202011%20105R.pdf http://psasir.upm.edu.my/id/eprint/42337/ |
Tags: |
Add Tag
No Tags, Be the first to tag this record!
|
Institution: | Universiti Putra Malaysia |
Language: | English |
Summary: | Fault tolerance is the feature of computing systems which can continue their normal operation in the presence of fault(s). In line with this, various methods have been proposed in the last three decades. Redundancy is one of the methods in building fault-tolerant systems which is implementable at both hardware and software levels. In these systems, voting algorithms are extensively employed to arbitrate among the results of N redundant hardware modules or software versions. They have a wide range of application in which the goal is to decrease the probability of the system hazardous behavior. So far, various voting methods are proposed which are mostly proper for small-scale systems. In this research, we proposed optimal algorithms using Divide and Conquer, Brent’s theorem and parallel algorithms, appropriate for today’s large scale systems such as satellite processing systems, traffic control, weather forecasting which all face a large quantity of processing input data. To do so, we have introduced a new sequential m-out-of-n algorithm with prediction ability which is known as enhanced m-out-of-n voter. Using a proper test-bed, after 10000 run times, we compared our newly proposed algorithm with the basic algorithm in terms of reliability and availability. By extracting various plots from different aspects, we demonstrated that, compared to the basic algorithm, our algorithm has higher reliability and availability as the quantity of the input data increases. Then, we introduced an appropriate system suitable for analytical modeling. It is called Predictive Hybrid m-out-of-n redundancy (PHmn) which is applicable for such systems as X-by-wire. To investigate the reliability and availability of this structure, discrete Markov models were obtained for reliability and availability analysis. The results of analytical modeling, based on different values of N, M, λ (failure rate), and μ (repair rate) demonstrated that the availability and reliability of the analytical modeling verify simulation result. Among basic voting algorithms, average voter and weighted average voter have higher availability but unfortunately they have higher time and calculation complexity in large scale systems. To solve this problem and gain benefits of this algorithm, we employed parallel algorithm technique and by using optimal number of processors, we could propose optimal algorithms known as Parallel Average Voting and Parallel Weighted Average Voting which both have optimal time complexity and less calculation cost. Since Plurality voting, among the popular and widely applied algorithms, has more correct responses even more than the most known and practical voting algorithm like majority and by relying on the benefit of parallel algorithms, we proposed parallel plurality voting with the minimum number of processors in an optimal time compared to its sequential type. In addition to having all the features of sequential algorithm, this algorithm has far less time complexity and has higher processing speed in voting process in large scale systems. In a nut shell, we tried to introduce voting algorithms and structures suitable for large scale fault-tolerant systems which have optimal and proper time complexity (in parallel voting algorithms) and more reliability and availability (in enhanced m-out-of-n voting algorithm) compared to the basic types. |
---|