Journal ArticleDOI

Trends and challenges in VLSI circuit reliability

C. Constantinescu
- 01 Jul 2003 - 
- Vol. 23, Iss: 4, pp 14-19
TL;DR: The main trends and challenges in circuit reliability are discussed, and evolving techniques for dealing with them are explained.
Citations
Proceedings ArticleDOI

Memory Errors in Modern Systems: The Good, The Bad, and The Ugly

TL;DR: This study uses data from two leadership-class high-performance computer systems to analyze the reliability impact of hardware resilience schemes that are deployed in current systems and finds that counting errors instead of faults, a common practice among researchers and data center operators, can lead to incorrect conclusions about system reliability.
Proceedings ArticleDOI

A study of DRAM failures in the field

TL;DR: DRAM failures are dominated by permanent, rather than transient, faults, although not to the extent found by previous publications, and chipkill error-correcting codes (ECC) are extremely effective, reducing the node failure rate from uncorrected DRAM errors by 42x compared to single-error correct/double-error detect (SEC-DED) ECC.
Proceedings Article

Flip Feng Shui: Hammering a Needle in the Software Stack

TL;DR: Flip Feng Shui (FFS) is introduced, a new exploitation vector that allows an attacker to induce bit flips over arbitrary physical memory in a fully controlled way; end-to-end attacks are exemplified that break OpenSSH public-key authentication and forge GPG signatures from trusted keys, thereby compromising the Ubuntu/Debian update mechanism.
Proceedings ArticleDOI

Revisiting Memory Errors in Large-Scale Production Data Centers: Analysis and Modeling of New Trends from the Field

TL;DR: This paper analyzes the memory errors in the entire fleet of servers at Facebook over the course of fourteen months, representing billions of device days, and observes several new reliability trends for memory systems that have not been discussed before in literature.
Journal ArticleDOI

Methods for fault tolerance in networks-on-chip

TL;DR: The article at hand reviews the failure mechanisms, fault models, diagnosis techniques, and fault-tolerance methods in on-chip networks, and surveys and summarizes the research of the last ten years.
References
Proceedings ArticleDOI

Modeling the effect of technology trends on the soft error rate of combinational logic

TL;DR: An end-to-end model is described and validated that enables us to compute the soft error rates (SER) for existing and future microprocessor-style designs and predicts that the SER per chip of logic circuits will increase nine orders of magnitude from 1992 to 2011 and at that point will be comparable to the SER per chip of unprotected memory elements.
Journal ArticleDOI

Transient-fault recovery using simultaneous multithreading

TL;DR: A scheme for transient-fault recovery called Simultaneously and Redundantly Threaded processors with Recovery (SRTR) is proposed that enhances a previously proposed scheme for transient-fault detection, called Simultaneously and Redundantly Threaded (SRT) processors.
Journal ArticleDOI

Physical and predictive models of ultrathin oxide reliability in CMOS devices and circuits

TL;DR: In this article, the authors review the physics and statistics of dielectric wearout and breakdown in ultrathin SiO₂-based gate dielectrics, and discuss the nature of the electrical conduction through a breakdown spot and the effect of the oxide breakdown on device and circuit performance.
Proceedings ArticleDOI

Impact of CMOS process scaling and SOI on the soft error rates of logic processes

TL;DR: The SER impact of process scaling over four technology generations is reported, along with an experimental assessment of alpha and, for the first time, neutron SER on advanced SOI processes, which have been considered as a possible method to reduce the SER of advanced technologies.
Journal ArticleDOI

High availability and reliability in the Itanium processor

Nhon Quach
- 01 Sep 2000 - 
TL;DR: The Itanium Processor is the first implementation of the Intel IA-64 architecture, designed for the high-end server market segment, and equipped with many advanced RAS features to maximize system reliability and availability.