Fault-Tolerant Design for Safety-Critical Applications

A special issue of Electronics (ISSN 2079-9292). This special issue belongs to the section "Circuit and Signal Processing".

Deadline for manuscript submissions: 15 May 2024

Special Issue Editor


Dr. Padmanabhan Balasubramanian
Guest Editor
School of Computer Science and Engineering, Nanyang Technological University, 50 Nanyang Avenue, Singapore 639798, Singapore
Interests: approximate computing; asynchronous circuits; computer arithmetic; digital integrated circuits; fault-tolerant design; reliability; logic synthesis

Special Issue Information

Dear Colleagues,

Many real-world safety-critical applications, such as space, aerospace, defense, nuclear power plants, electric power transmission and distribution, industrial control and automation, and banking and finance systems, usually involve fault-tolerant design at the hardware and/or software level for enhanced reliability. Fault-tolerant design is essential to cope with faults or failures of underlying circuits and systems during routine operation and while operating in harsh environments. This Special Issue covers recent advances in fault-tolerant design at the hardware and software levels. We invite the submission of high-quality academic and industrial research work on all aspects of fault-tolerant design and reliability. The topics of interest are broad and address fault-tolerant electrical, electronic, computer, and communication systems, including:

  • Methods for assessing the reliability of devices, circuits, and systems
  • Fault tolerance in low-power electronics: microelectronics, nanoelectronics, and optoelectronics
  • Fault tolerance in memories
  • Fault tolerance in communication systems (networks, network-on-chip, etc.)
  • Fault tolerance in high-power electronics
  • Fault-tolerant design of electrical machines
  • Fault tolerance in renewable energy systems including solar, wind, wave, geothermal, etc.
  • Fault tolerance in emerging (post-CMOS) technologies
  • Software fault tolerance
  • Impact of radiation on reliability of devices, circuits, and systems
  • Reliability studies on MEMS, sensors, photonic devices, wafer-level packaging, and assembly and interconnects
  • Reliability of microwave devices and circuits
  • Modeling reliability versus aging in low-power and high-power electronics
  • Reliability assessment of batteries
  • Reliability assessment and prediction in space, aerospace, and automotive systems

Dr. Padmanabhan Balasubramanian
Guest Editor

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, click here to go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles as well as short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Electronics is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2400 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Published Papers (2 papers)


Research

15 pages, 7360 KiB  
Article
FAC: A Fault-Tolerant Design Approach Based on Approximate Computing
by Padmanabhan Balasubramanian and Douglas L. Maskell
Electronics 2023, 12(18), 3819; https://doi.org/10.3390/electronics12183819 - 09 Sep 2023
Abstract
This article introduces a new fault-tolerant design approach based on approximate computing, called FAC, for designing redundant circuits and systems. Traditionally, triple modular redundancy (TMR) has been used to ensure complete tolerance to any single fault or a faulty processing unit, where the processing unit may be a circuit or a system. However, TMR incurs more than 200% overhead in terms of area and power compared to a single processing unit. Alternative redundancy approaches have been proposed in the literature to mitigate these overheads associated with TMR, but they provide only partial or moderate fault tolerance. Among the alternatives, majority voting-based reduced precision redundancy (MVRPR) may be useful for error-resilient applications such as digital signal processing. While MVRPR guarantees only moderate fault tolerance, the proposed FAC is well-suited for error-resilient applications and ensures 100% tolerance to any single fault or a faulty processing unit, like TMR. In this work, we evaluate the performance of TMR, MVRPR, and FAC for a digital image processing application. The image processing results obtained demonstrate the effectiveness of FAC. Moreover, when the processing unit is implemented using a 28-nm CMOS technology, FAC achieves significant improvements over TMR, including a 15.3% reduction in delay, a 19.5% reduction in area, and a 24.7% reduction in power. Compared to MVRPR, FAC exhibits notable enhancements, with an 18% reduction in delay, a 5.4% reduction in area, and an 11.2% reduction in power. When considering the power-delay product, which reflects energy efficiency, FAC demonstrates a 36.2% reduction compared to TMR and a 27.2% reduction compared to MVRPR. When considering the power-delay-area product, which represents design efficiency, FAC achieves a 48.7% reduction compared to TMR and a 31.1% reduction compared to MVRPR.
(This article belongs to the Special Issue Fault-Tolerant Design for Safety-Critical Applications)
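
The power-delay and power-delay-area reductions quoted in the abstract are consistent with multiplying together the fractions of delay, power, and area that remain after each individual reduction. The following is a minimal Python sketch of that arithmetic, assuming all percentages are measured against the same baseline design; the function name combined_reduction is illustrative and does not come from the article.

```python
def combined_reduction(*reductions_pct):
    """Return the percentage reduction of a product of metrics,
    given the percentage reduction of each individual metric."""
    remaining = 1.0
    for r in reductions_pct:
        # A reduction of r% leaves (100 - r)% of the metric.
        remaining *= 1.0 - r / 100.0
    return round((1.0 - remaining) * 100.0, 1)


# FAC vs. TMR: 15.3% less delay, 24.7% less power, 19.5% less area
print(combined_reduction(15.3, 24.7))        # power-delay product: 36.2
print(combined_reduction(15.3, 24.7, 19.5))  # power-delay-area product: 48.7

# FAC vs. MVRPR: 18% less delay, 11.2% less power, 5.4% less area
print(combined_reduction(18, 11.2))          # power-delay product: 27.2
print(combined_reduction(18, 11.2, 5.4))     # power-delay-area product: 31.1
```

The results agree with the abstract's reported figures to one decimal place, which suggests the product metrics were derived from the same delay, power, and area measurements.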

19 pages, 1004 KiB  
Article
Scalable Communication-Induced Checkpointing Protocol with Little Overhead for Distributed Computing Environments
by Jinho Ahn
Electronics 2023, 12(12), 2702; https://doi.org/10.3390/electronics12122702 - 16 Jun 2023
Abstract
The existing communication-induced checkpointing protocols may not scale well due to their slow acquisition of the most recent timestamps of the next checkpoints of other processes. Accurate situation awareness with diversified information conveyance paths is needed to keep the number of unnecessary forced checkpoints as low as possible. In this paper, a scalable communication-induced checkpointing protocol is proposed to considerably cut down the possibility of performing unnecessary forced checkpointing by exploiting the beneficial features of reliable communication channels. The protocol enables the sender of an application message to swiftly attain the most recent timestamp-related information of the next checkpoint of its receiver and accelerate the spread of the information to others, with little overhead. This behavioral feature may significantly elevate the accuracy of the awareness of the situations in which forced checkpointing is actually needed for useless checkpoint-free recovery. In addition, it generates no extra control message and no message logging overhead while significantly lessening the latency of message sending. Moreover, the protocol can always be operated under the non-deterministic execution model. The evaluation results indicate that the proposed protocol outperforms the existing ones, reducing forced checkpointing overhead by 12.5% to 84.2% and total execution time by 2.5% to 11.5%.
(This article belongs to the Special Issue Fault-Tolerant Design for Safety-Critical Applications)
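
For readers unfamiliar with communication-induced checkpointing, the general idea of a forced checkpoint can be sketched with the classic timestamp-based rule: each process piggybacks a checkpoint timestamp on its messages, and a receiver takes a forced checkpoint before delivering a message whose timestamp is ahead of its own. The Python sketch below illustrates only this textbook rule under simplified assumptions. It is not the protocol proposed in the article, and the class and method names are hypothetical.

```python
class Process:
    """Toy process following a basic timestamp-based forced-checkpoint rule."""

    def __init__(self, name):
        self.name = name
        self.clock = 0          # timestamp of the latest checkpoint
        self.checkpoints = []   # list of (kind, timestamp) records

    def basic_checkpoint(self):
        # A checkpoint the process takes on its own schedule.
        self.clock += 1
        self.checkpoints.append(("basic", self.clock))

    def send(self):
        # The current timestamp is piggybacked on every application message.
        return self.clock

    def receive(self, piggybacked_clock):
        # If the sender is ahead, take a forced checkpoint before delivery
        # so the receiver's checkpoints stay consistent with the sender's.
        if piggybacked_clock > self.clock:
            self.clock = piggybacked_clock
            self.checkpoints.append(("forced", self.clock))
            return True
        return False


p, q = Process("p"), Process("q")
p.basic_checkpoint()
forced = q.receive(p.send())   # q is behind p, so a forced checkpoint is taken
```

Under this rule, a receiver with stale timestamp information can take forced checkpoints that later turn out to be unnecessary, which is the kind of overhead the abstract aims to reduce.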