Vilas Sridharan

Researcher at Advanced Micro Devices

Publications -  70
Citations -  2013

Vilas Sridharan is an academic researcher from Advanced Micro Devices. The author has contributed to research in topics: Cache & CPU cache. The author has an h-index of 19 and has co-authored 65 publications receiving 1804 citations. Previous affiliations of Vilas Sridharan include Northeastern University.

Papers
Proceedings Article

Memory Errors in Modern Systems: The Good, The Bad, and The Ugly

TL;DR: This study uses data from two leadership-class high-performance computer systems to analyze the reliability impact of hardware resilience schemes that are deployed in current systems and finds that counting errors instead of faults, a common practice among researchers and data center operators, can lead to incorrect conclusions about system reliability.
Proceedings Article

A study of DRAM failures in the field

TL;DR: DRAM failures are dominated by permanent, rather than transient, faults, although not to the extent found by previous publications, and chipkill error-correcting codes (ECC) are extremely effective, reducing the node failure rate from uncorrected DRAM errors by 42x compared to single-error correct/double-error detect (SEC-DED) ECC.
Proceedings Article

Feng shui of supercomputer memory: positional effects in DRAM and SRAM faults

TL;DR: A study of DRAM and SRAM faults in large high-performance computing systems to understand the factors that influence faults in production settings and finds that altitude has a substantial impact on SRAM faults, and that top-of-rack placement correlates with a 20% higher fault rate.
Proceedings Article

Eliminating microarchitectural dependency from Architectural Vulnerability

TL;DR: This work demonstrates that the new Program Vulnerability Factor (PVF) metric provides such a basis: PVF captures the architecture-level fault masking inherent in a program, allowing software designers to make quantitative statements about a program's tolerance to soft errors.
Proceedings Article

Balancing Performance and Reliability in the Memory Hierarchy

TL;DR: This work presents a new method to accurately estimate the reliability of cache memories, along with three techniques that reduce the susceptibility of first-level caches to soft errors by two orders of magnitude.