Distributed recovery block scheme-based fault tolerant message passing system

Bibliographic Details
Main Author: Gu, Wei.
Other Authors: Khan, Gul Nawaz
Format: Theses and Dissertations
Published: 2008
Online Access: http://hdl.handle.net/10356/2646
Institution: Nanyang Technological University
Description
Summary: This thesis presents a fault-tolerant message passing system incorporating a variation of the distributed recovery block approach. Interprocessor communication is one of the key activities of parallel and distributed computer systems. Message passing in large interconnection networks is a critical part of high-performance computing and has attracted a great deal of attention in recent years. In many applications, the requirements for efficient interprocessor communication and system reliability are increasing. However, most general-purpose parallel and distributed systems give little attention to these requirements. The aim of this research is to develop a fault-tolerant and adaptive message passing system that assures successful delivery of messages even under faulty conditions. This thesis presents an investigation of fault-tolerant routing algorithms for unicast, multicast and broadcast that deliver messages as long as a healthy path exists between the source and destination nodes.