
Showing papers by "Moinuddin K. Qureshi published in 2013"


Proceedings ArticleDOI
23 Jun 2013
TL;DR: This paper proposes ArchShield, an architectural framework that employs runtime testing to identify faulty DRAM cells; it efficiently tolerates error-rates as high as 10^-4 (100x higher than ECC alone), causes less than 2% performance degradation, and still maintains 1-bit error tolerance against soft errors.
Abstract: DRAM scaling has been the prime driver for increasing the capacity of the main memory system over the past three decades. Unfortunately, scaling DRAM to smaller technology nodes has become challenging due to the inherent difficulty in designing smaller geometries, coupled with the problems of device variation and leakage. Future DRAM devices are likely to experience significantly high error-rates. Techniques that can tolerate errors efficiently can enable DRAM to scale to smaller technology nodes. However, existing techniques such as row/column sparing and ECC become prohibitive at high error-rates. To develop cost-effective solutions for tolerating high error-rates, this paper advocates a cross-layer approach. Rather than hiding the faulty cell information within the DRAM chips, we expose it to the architectural level. We propose ArchShield, an architectural framework that employs runtime testing to identify faulty DRAM cells. ArchShield tolerates these faults using two components, a Fault Map that keeps information about faulty words in a cache line, and Selective Word-Level Replication (SWLR) that replicates faulty words for error resilience. Both the Fault Map and SWLR are integrated in a reserved area of DRAM memory. Our evaluations with an 8GB DRAM DIMM show that ArchShield can efficiently tolerate error-rates as high as 10^-4 (100x higher than ECC alone), causes less than 2% performance degradation, and still maintains 1-bit error tolerance against soft errors.
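
To make the Fault Map / SWLR read path concrete, here is a minimal sketch in Python (the dictionary-based structures and names such as `replica_area` are illustrative assumptions; in the actual design the Fault Map and replicated words reside in a reserved region of DRAM, with frequently used fault-map entries cached on chip):

```python
# Minimal sketch of the ArchShield read path described above (illustrative
# names; the real Fault Map and replicated words live in reserved DRAM).

class ArchShieldSketch:
    def __init__(self):
        self.fault_map = {}     # line address -> bitmap of faulty words in the line
        self.replica_area = {}  # (line address, word index) -> replicated word

    def mark_faulty(self, line_addr, word_idx, good_value):
        """Record a faulty word found by runtime testing and replicate it (SWLR)."""
        self.fault_map[line_addr] = self.fault_map.get(line_addr, 0) | (1 << word_idx)
        self.replica_area[(line_addr, word_idx)] = good_value

    def read_line(self, line_addr, raw_words):
        """Return the cache line, patching any word known to be faulty."""
        bitmap = self.fault_map.get(line_addr, 0)
        if bitmap == 0:
            return list(raw_words)  # common case: no faulty words, no extra access
        return [self.replica_area[(line_addr, i)] if bitmap & (1 << i) else w
                for i, w in enumerate(raw_words)]
```

Because most lines have no entry in the Fault Map, the common-case read needs no extra access, which is consistent with the small performance overhead reported above.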

178 citations


Proceedings ArticleDOI
23 Feb 2013
TL;DR: This paper proposes Refresh Pausing, a solution that is highly effective at alleviating the contention from refresh operations; it provides an average performance improvement of 5.1% for 8Gb devices and becomes even more effective for future high-density technologies.
Abstract: DRAM cells rely on periodic refresh operations to maintain data integrity. As the capacity of DRAM memories has increased, so has the amount of time consumed in doing refresh. Refresh operations contend with read operations, which increases read latency and reduces system performance. We show that eliminating the latency penalty due to refresh can improve average performance by 7.2%. However, simply doing intelligent scheduling of refresh operations is ineffective at obtaining significant performance improvement. This paper provides an alternative and scalable option to reduce the latency penalty due to refresh. It exploits the property that each refresh operation in a typical DRAM device internally refreshes multiple DRAM rows in JEDEC-based distributed refresh mode. Therefore, a refresh operation has well-defined points at which it can potentially be paused to service a pending read request. Leveraging this property, we propose Refresh Pausing, a solution that is highly effective at alleviating the contention from refresh operations. It provides an average performance improvement of 5.1% for 8Gb devices, and becomes even more effective for future high-density technologies. We also show that Refresh Pausing significantly outperforms the recently proposed Elastic Refresh scheme.
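
As a rough illustration of the idea (a toy timing model in Python; the number of rows per refresh and the single-bank queueing discipline are assumptions, not the paper's parameters), a refresh that internally walks several rows exposes a pause point at each row boundary where a pending read can be serviced:

```python
# Toy model of Refresh Pausing: a refresh that internally covers several rows
# is paused at row boundaries to service pending reads, then resumed.

from collections import deque

def service_bank(refresh_rows, reads, t_refresh_row=1, t_read=1):
    """Interleave one multi-row refresh with arriving reads; return an event log.
    'reads' is a list of (arrival_time, address) pairs in arrival order."""
    log, time = [], 0
    pending = deque(reads)
    for row in refresh_rows:
        # Pause point: before refreshing the next internal row, drain waiting reads.
        while pending and pending[0][0] <= time:
            _, addr = pending.popleft()
            time += t_read
            log.append(("read", addr, time))
        time += t_refresh_row
        log.append(("refresh_row", row, time))
    return log

# A read arriving mid-refresh waits for at most one internal row, not the full refresh.
print(service_bank(refresh_rows=[0, 1, 2, 3], reads=[(1, 0x40)]))
```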

93 citations


Proceedings ArticleDOI
24 Jun 2013
TL;DR: FLAIR leverages the correction features of the existing SECDED code to greatly improve on simple two-way replication; it provides a Vmin of 485 mV and maintains robustness to soft errors while incurring a storage overhead of only one bit per cache line.
Abstract: Voltage scaling is often limited by bit failures in large on-chip caches. Prior approaches for enabling cache operation at low voltages rely on correcting cache lines with multi-bit failures. Unfortunately, multi-bit Error Correcting Codes (ECC) incur significant storage overhead and complex logic. Our goal is to develop solutions that enable ultra-low voltage operation while incurring minimal changes to existing SECDED-based cache designs. We exploit the observation that only a small percentage of cache lines have multi-bit failures. We propose FLexible And Introspective Replication (FLAIR), which performs two-way replication for part of the cache during testing to maintain robustness, and disables lines with multi-bit failures after testing. FLAIR leverages the correction features of the existing SECDED code to greatly improve on simple two-way replication. FLAIR provides a Vmin of 485 mV (similar to ECC-8) and maintains robustness to soft errors, while incurring a storage overhead of only one bit per cache line.
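
A minimal sketch of the replication-plus-SECDED read path (Python; the decode interface is an assumption, and the actual FLAIR design combines the two copies more intelligently than simply falling back from one to the other):

```python
# Sketch of reading a line kept in two copies, each protected by the cache's
# existing SECDED code (hypothetical decode interface).

def read_replicated_line(copy_a, copy_b, secded_decode):
    """secded_decode(copy) -> (data, ok); ok is False for multi-bit failures."""
    data_a, ok_a = secded_decode(copy_a)
    if ok_a:
        return data_a               # primary copy is usable with SECDED alone
    data_b, ok_b = secded_decode(copy_b)
    if ok_b:
        return data_b               # replica rescues a multi-bit failure in copy A
    # Both copies fail SECDED: such lines are identified during testing and
    # disabled, so this path should not be reached at runtime.
    raise RuntimeError("line must be disabled: multi-bit failures in both copies")

# Stand-in decoder for illustration: None models an uncorrectable copy.
stub = lambda copy: (copy, copy is not None)
print(read_replicated_line(None, 0xCAFE, stub))  # -> 51966
```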

33 citations


Patent
28 Feb 2013
TL;DR: In this patent, a method is described for correcting errors on a DRAM having an ECC, which includes writing data to a DRAM row, reading data from the row, detecting errors in the data that cannot be corrected by the ECC, and determining erasure information for the row.
Abstract: This disclosure describes a method for correcting errors on a DRAM having an ECC. The method includes writing data to a DRAM row, reading data from the DRAM row, detecting errors in the data that cannot be corrected by the DRAM's ECC, determining erasure information for the row, evaluating the errors using the erasure information, and correcting the errors in the data.
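
As a hedged illustration of this flow (Python; the ECC decode interface and the trial-flipping of known-bad positions are modeling assumptions, not the patent's claimed implementation):

```python
# Sketch of erasure-assisted correction: when plain ECC decode fails, retry
# after flipping subsets of the bit positions that the erasure information
# for the row marks as unreliable.

from itertools import chain, combinations

def correct_with_erasures(read_word, erasure_bits, ecc_decode):
    """ecc_decode(word) -> (data, ok). The empty subset is the plain ECC attempt."""
    subsets = chain.from_iterable(combinations(erasure_bits, k)
                                  for k in range(len(erasure_bits) + 1))
    for subset in subsets:
        candidate = read_word
        for bit in subset:
            candidate ^= 1 << bit   # assume this erased bit position flipped
        data, ok = ecc_decode(candidate)
        if ok:
            return data
    raise RuntimeError("uncorrectable even with erasure information")
```

In coding-theory terms, an error at a known (erased) position consumes roughly half the correction capability of an error at an unknown position, which is why errors that the ECC alone reports as uncorrectable can become correctable once erasure information is available.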

15 citations


Proceedings ArticleDOI
07 Mar 2013
TL;DR: This talk proposes Start-Gap, a simple and effective wear-leveling technique that incurs an overhead of only a few bytes and still provides a lifetime close to that of ideal wear leveling, and discusses the performance impact of high write latency, as most of the emerging memory technologies tend to have write latency much higher than read latency.
Abstract: Summary form only given. As conventional memory technologies such as DRAM run into the scaling wall, architects and system designers are forced to look at alternative technologies for building future computer systems. Several emerging Non-Volatile Memory (NVM) technologies such as PCM, STT-RAM, and Memristors have the potential to boost memory capacity in a scalable and power-efficient manner. However, these technologies are not drop-in replacements and will require novel solutions to enable their deployment. Even the prime candidates among these technologies have their own set of challenges, such as higher read latency (than DRAM), much higher write latency, and limited write endurance. In this talk, I will discuss some of our recent work that addresses these challenges. Our first solution is a hybrid memory system that combines emerging memory technologies with a small DRAM buffer, thereby obtaining the latency of DRAM in the common case and the higher capacity of emerging technologies. Such a hybrid memory system allows combining technologies that have latency higher than DRAM without significantly impacting read latency. Second, we target the problem of limited write endurance, which is common to many of the emerging memory technologies. This problem is exacerbated by non-uniformity in write traffic to memory, causing frequently written lines to fail much earlier than others, thereby reducing system lifetime significantly. Unfortunately, existing wear-leveling techniques require large storage tables and indirection, resulting in significant area and latency overheads. We propose Start-Gap, a simple and effective wear-leveling technique that incurs an overhead of only a few bytes and still provides a lifetime close to that of ideal wear leveling. Finally, I will discuss the performance impact of high write latency, as most of the emerging memory technologies tend to have write latency much higher than read latency. While a higher write latency can typically be tolerated using buffers, once a write request is scheduled for service at a bank, it can still cause increased latency for later-arriving read requests to the same bank. To avoid this latency penalty caused by contention from slow write operations, we propose write cancellation and write pausing as a means to tolerate slow writes.
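
To give a flavor of the Start-Gap idea (a simplified, self-consistent model in Python; the exact register-update rules of the published scheme may differ, and the periodic line copy is only indicated in comments):

```python
# Toy model of Start-Gap wear leveling: N logical lines are mapped onto N+1
# physical lines using only two registers (Start and Gap), with no mapping table.

class StartGapSketch:
    def __init__(self, n_lines, writes_per_gap_move=100):
        self.n = n_lines
        self.start = 0
        self.gap = n_lines        # physical index of the spare (gap) line
        self.psi = writes_per_gap_move
        self.writes = 0

    def logical_to_physical(self, la):
        pa = (la + self.start) % self.n
        return pa + 1 if pa >= self.gap else pa   # lines past the gap shift by one

    def on_write(self):
        """Every psi writes, advance the gap by one line (one line copy in hardware)."""
        self.writes += 1
        if self.writes % self.psi:
            return
        if self.gap == 0:
            # Gap wraps: copy physical[N] -> physical[0]; physical[N] becomes the gap.
            self.gap = self.n
            self.start = (self.start + 1) % self.n
        else:
            self.gap -= 1         # copy physical[gap-1] -> physical[gap] in hardware

sg = StartGapSketch(n_lines=8, writes_per_gap_move=4)
for _ in range(1000):
    sg.on_write()
assert len({sg.logical_to_physical(la) for la in range(8)}) == 8  # mapping stays one-to-one
```

Because the gap steadily rotates through the array, heavily written logical lines are spread over different physical lines over time, without any per-line remapping table or indirection.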