Threat of soft errors

With aggressive technology scaling, the soft error rate is projected to become a serious design consideration even for earth-bound embedded systems manufactured in nanometer-scale technology. The ITRS (International Technology Roadmap for Semiconductors) and many researchers expect the soft error rate to increase exponentially with each technology generation. A high-energy radiation particle, e.g., an alpha particle, a neutron, or a free proton, may strike the diffusion region of a CMOS transistor and deposit enough charge to toggle the logic value held in a gate or flip-flop. Such a change in logic state is called a soft error or transient fault.

Challenges for dependable embedded systems

Conventional redundancy techniques for protecting embedded systems from soft errors, such as TMR (Triple Modular Redundancy) and ECC (Error Correction Codes), incur high overheads in area, power, and performance. For instance, conventional TMR typically uses three functionally identical replicas of a logic circuit plus a majority voter, so its hardware and power overheads exceed 200%. Because embedded systems operate under tight area, power, and performance constraints, embedded system designers face several emerging challenges at the microarchitecture, compiler, and system levels. Solutions for dependable embedded systems should therefore achieve high reliability against soft errors with the least overhead in power, performance, and area.
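To make the TMR overhead concrete, the following is a minimal software sketch of the idea, not a hardware design. It runs a computation three times, optionally flips one bit in one replica's output to model a soft error, and takes a bitwise majority vote. The names majority_vote, flip_bit, and tmr, and the single-bit fault model, are assumptions introduced for illustration only. The two extra replicas alone account for a 200% overhead relative to the unprotected circuit, and the voter adds to that.

```python
from typing import Callable, Optional


def majority_vote(a: int, b: int, c: int) -> int:
    """Bitwise majority of three replica outputs.

    Each output bit takes the value that at least two replicas agree on.
    """
    return (a & b) | (b & c) | (a & c)


def flip_bit(word: int, bit: int) -> int:
    """Model a soft error as a single-bit upset (hypothetical fault model)."""
    return word ^ (1 << bit)


def tmr(compute: Callable[[int], int], x: int,
        faulty_replica: Optional[int] = None, bit: int = 0) -> int:
    """Run three identical replicas of `compute` and vote on the results.

    If `faulty_replica` is 0, 1, or 2, one bit of that replica's output
    is flipped before voting to emulate a transient fault.
    """
    outputs = [compute(x) for _ in range(3)]  # three replicas: 3x the work
    if faulty_replica is not None:
        outputs[faulty_replica] = flip_bit(outputs[faulty_replica], bit)
    return majority_vote(*outputs)


if __name__ == "__main__":
    def f(x: int) -> int:
        return (x * 3 + 1) & 0xFF  # arbitrary 8-bit example function

    print(tmr(f, 5))                           # fault-free result
    print(tmr(f, 5, faulty_replica=1, bit=3))  # same result, fault masked
```

A single upset in one replica is masked because the other two replicas still agree. Two upsets at the same bit position in different replicas would defeat the voter, and in this sketch the voter itself is assumed to be fault-free.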

Reading list