As semiconductor technology progresses toward the nanoscale, increasing design complexity, together with costly production testing and burn-in, makes it more difficult to ensure that shipped chips are failure-free. In-field failure sources such as infant mortality, soft errors, silicon aging, and electromigration degrade quality further. To increase in-field chip availability, online checking followed by fault circumvention is a promising direction, since it would lower product return rates and service costs. However, the area overhead and performance penalties of existing online checking approaches are significant, which makes them unsuitable for cost-sensitive applications such as most consumer electronics.
To reduce hardware overhead, we propose an online checking scheme, Time-Multiplexed Online Checking (TMOC), that offers sufficient fault coverage with less overhead at the cost of increased fault detection latency. In TMOC, a design is partitioned into modules, each of which has its own TMOC checker. One or more embedded, field-reprogrammable blocks serve as shared checker spaces. The TMOC checkers for different modules are mapped sequentially and periodically into a shared field-reprogrammable checker space in a time-interleaved fashion, so that only one checker per shared space is active at any given time.
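The time-interleaved mapping can be illustrated with a minimal Python sketch. It is a simplified model and not the implementation used in this work: it assumes a single shared checker space, a fixed round-robin order, and a constant reconfiguration time. The module names, dwell times, and the functions `tmoc_schedule` and `max_unchecked_interval` are hypothetical and chosen only for illustration.

```python
from itertools import cycle, islice


def tmoc_schedule(checkers, num_slots):
    """Return the checker occupying the shared space in each time slot.

    Assumes a plain round-robin order over the module checkers.
    """
    return list(islice(cycle(checkers), num_slots))


def max_unchecked_interval(dwell, reconfig_time, module):
    """Longest time `module` runs without its checker being active.

    Assumed model: each checker costs `reconfig_time` to load and then
    stays active for its dwell time. The worst case for `module` starts
    right after its checker is unloaded: every other checker is loaded
    and run once, then the module's own checker is loaded again.
    """
    others = sum(reconfig_time + d for name, d in dwell.items() if name != module)
    return others + reconfig_time


# Hypothetical dwell times (arbitrary time units) for three modules.
dwell = {"adder": 4, "multiplier": 6, "fsm": 2}

schedule = tmoc_schedule(list(dwell), 5)
intervals = {m: max_unchecked_interval(dwell, 1, m) for m in dwell}
print(schedule)
print(intervals)
```

Under these assumptions, the unchecked interval of a module grows with the number of checkers sharing the space and with their dwell times, which is the detection-latency cost that TMOC trades for a smaller checker area.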
TMOC can be applied to systems that can tolerate a certain level of fault detection latency and that are implemented either entirely in an FPGA or in an SoC/SiP with an embedded field-reprogrammable block. We have successfully implemented the TMOC scheme for a set of arithmetic circuits and Finite State Machines (FSMs) on a Virtex II Pro board without disturbing system operation. A case study on a JPEG codec design and a demonstration based on a TV-to-VGA decoder showed that TMOC can be applied to complex designs by employing the proposed state synchronization technique. The experimental results showed that a significant reduction in checker area overhead can be achieved when the design is properly partitioned.
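Why partitioning matters for area can be seen in a rough back-of-the-envelope sketch. It assumes that a shared checker space only needs to fit the largest checker mapped into it, and it ignores the reconfiguration controller, configuration storage, and routing. The area figures are made up for illustration and are not results from this work.

```python
def area_reduction(checker_areas):
    """Fraction of checker area saved by sharing one checker space.

    Assumed model: dedicated checking costs the sum of all checker
    areas, while a shared space costs only the largest checker.
    """
    dedicated = sum(checker_areas)
    shared = max(checker_areas)
    return 1 - shared / dedicated


# Hypothetical checker areas in arbitrary units.
balanced = [100, 100, 100, 100]   # evenly sized checkers
unbalanced = [300, 50, 25, 25]    # one checker dominates

print(area_reduction(balanced))
print(area_reduction(unbalanced))
```

In this simplified model, a partition with evenly sized checkers saves far more area than one in which a single checker dominates, because the dominant checker alone sets the size of the shared space.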
Publications
[5] Ming Gao, Sherman Chang, Peter Lisherness, Tim Cheng, "Time-Multiplexed Online Checking: A Feasibility Study," in Proceedings of the 17th Asian Test Symposium (ATS'08), IEEE Computer Society, Sapporo, Japan, November 24-27, 2008.
[4] Ming Gao, Tim Cheng, "Time-Multiplexed Online Checking: Resilient Design for Cost-Sensitive SoCs," Thesis Research Poster Session, IEEE VLSI Test Symposium (VTS'08), April 2008.
[3] Ming Gao, Sherman Chang, Peter Lisherness, Tim Cheng, "Time-Multiplexed Online Checking: A Feasibility Study," IEEE International Workshop in Synthesis and Testing (ITSW'08), UCSB, April 2008.
[2] Ming Gao, "Resilient Design for Cost-Sensitive SoCs," Ph.D. Thesis Proposal, Ph.D. Qualifying Examination, ECE, UCSB, March 2008.
[1] Ming Gao, Sherman Chang, Tim Cheng, "Self-Healing Emulation on BEE2 System," Resilient Theme Poster Session, GSRC Annual Symposium, San Jose, California, September 2007.