FAC: a fault-tolerant design approach based on approximate computing

This article introduces a new fault-tolerant design approach based on approximate computing, called FAC, for designing redundant circuits and systems. Traditionally, triple modular redundancy (TMR) has been used to ensure complete tolerance to any single fault or a faulty processing unit, where the...

Full description

Saved in:
Bibliographic Details
Main Authors: Balasubramanian, Padmanabhan, Maskell, Douglas Leslie
Other Authors: School of Computer Science and Engineering
Format: Article
Language:English
Published: 2023
Subjects:
Online Access:https://hdl.handle.net/10356/170487
Tags: Add Tag
No Tags, Be the first to tag this record!
Institution: Nanyang Technological University
Language: English
Description
Summary:This article introduces a new fault-tolerant design approach based on approximate computing, called FAC, for designing redundant circuits and systems. Traditionally, triple modular redundancy (TMR) has been used to ensure complete tolerance to any single fault or a faulty processing unit, where the processing unit may be a circuit or a system. However, TMR incurs more than 200% overhead in terms of area and power compared to a single processing unit. Alternative redundancy approaches have been proposed in the literature to mitigate these overheads associated with TMR, but they provide only partial or moderate fault tolerance. Among the alternatives, majority voting-based reduced precision redundancy (MVRPR) may be useful for error-resilient applications such as digital signal processing. While MVRPR guarantees only moderate fault tolerance, the proposed FAC is well-suited for error-resilient applications and ensures 100% tolerance to any single fault or a faulty processing unit, like TMR. In this work, we evaluate the performance of TMR, MVRPR, and FAC for a digital image processing application. The image processing results obtained demonstrate the effectiveness of FAC. Moreover, when the processing unit is implemented using a 28-nm CMOS technology, FAC achieves significant improvements over TMR, including a 15.3% reduction in delay, a 19.5% reduction in area, and a 24.7% reduction in power. Compared to MVRPR, FAC exhibits notable enhancements, with an 18% reduction in delay, a 5.4% reduction in area, and an 11.2% reduction in power. When considering the power-delay product, which reflects energy efficiency, FAC demonstrates a 36.2% reduction compared to TMR and a 27.2% reduction compared to MVRPR. When considering the power-delay-area product, which represents design efficiency, FAC achieves a 48.7% reduction compared to TMR and a 31.1% reduction compared to MVRPR.