Research

As semiconductor devices become faster and denser, circuits become more susceptible to transient errors. As the differences between voltage levels shrink and noise margins decrease, neutrons from cosmic rays and alpha particles from impurities in packaging materials may alter data values and circuit outputs. Because embedded processors are deployed in critical applications, transient errors are a significant design concern in embedded systems. Even in non-critical applications such as entertainment or multimedia systems, a failure or abnormal behavior can jeopardize the manufacturer's reputation. Thus, analyzing vulnerability to transient errors and protecting systems against them is as important as performance, power, and yield in the semiconductor industry. Designing and implementing reliable embedded systems requires error detection and correction mechanisms at different layers of the system stack, for example at the circuit, micro-architecture, operating system kernel, and application levels.

Soft error rates in VLSI circuits are increasing because of technology scaling and circuit miniaturization. Research on mitigating soft errors has been under way for more than three decades, and techniques for detecting and protecting against soft errors exist at different levels of a computer system. The author first discusses the vulnerability of hardware devices to soft failures, then the techniques for detecting errors and protecting computer systems against them, and finally how to verify those schemes using error injection.
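
To make the idea of verifying a detection scheme by error injection concrete, here is a minimal software sketch. It is an illustration under stated assumptions, not a method taken from this work: it models a soft error as a single bit flip in a data word and uses a one-bit even-parity check as the detector. The function names (`even_parity`, `inject_bit_flip`, `run_campaign`) and the parameters (32-bit words, a fixed random seed) are hypothetical choices made for the example.

```python
import random


def even_parity(word: int) -> int:
    """Return 1 if the number of set bits in `word` is odd, else 0."""
    return bin(word).count("1") & 1


def inject_bit_flip(word: int, bit: int) -> int:
    """Model a single-bit soft error by inverting bit position `bit`."""
    return word ^ (1 << bit)


def run_campaign(trials: int = 1000, width: int = 32, seed: int = 0) -> int:
    """Inject one random bit flip per trial and count detected errors.

    The parity bit is computed before the fault is injected, as if it
    had been stored alongside the data word (an assumption of this sketch).
    """
    rng = random.Random(seed)
    detected = 0
    for _ in range(trials):
        data = rng.getrandbits(width)
        stored_parity = even_parity(data)
        faulty = inject_bit_flip(data, rng.randrange(width))
        # A mismatch between stored and recomputed parity flags an error.
        if even_parity(faulty) != stored_parity:
            detected += 1
    return detected


if __name__ == "__main__":
    trials = 1000
    print(f"detected {run_campaign(trials)} of {trials} injected faults")
```

A single bit flip always changes the parity of a word, so this detector flags every fault injected here. It cannot correct the error, and it misses any even number of flips in the same word, which is one reason stronger codes and protection at several layers are used in practice.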