Sheng Li

Researcher at Google

Publications -  74
Citations -  4927

Sheng Li is an academic researcher at Google. The author has contributed to research on topics including cache and interleaved memory. The author has an h-index of 20 and has co-authored 72 publications receiving 4176 citations. Previous affiliations of Sheng Li include Hewlett-Packard and the University of Notre Dame.

Papers
Proceedings Article

McPAT: an integrated power, area, and timing modeling framework for multicore and manycore architectures

TL;DR: Combining power, area, and timing results of McPAT with performance simulation of PARSEC benchmarks at the 22nm technology node for both common in-order and out-of-order manycore designs shows that when die cost is not taken into account, clustering 8 cores together gives the best energy-delay product, whereas when cost is taken into account, configuring clusters with 4 cores gives the best EDA2P and EDAP.
Proceedings Article

Kiln: closing the performance gap between systems with and without persistence support

TL;DR: Kiln is a persistent memory design that adopts a nonvolatile cache and a nonvolatile main memory to enable atomic in-place updates without logging or copy-on-write, and can achieve a 2× performance improvement compared with NVRAM-based persistent memory with write-ahead logging.
Proceedings Article

CACTI-P: architecture-level modeling for SRAM-based structures with advanced leakage reduction techniques

TL;DR: It is found that although nanosecond-scale power-gating is a powerful way to minimize leakage power for all levels of caches, its severe impacts on processor performance and energy when used for L1 data caches make nanosecond-scale power-gating a better fit for caches closer to main memory.
Journal Article

The McPAT Framework for Multicore and Manycore Architectures: Simultaneously Modeling Power, Area, and Timing

TL;DR: Combining power, area, and timing results of McPAT with performance simulation of PARSEC benchmarks for manycore designs at the 22nm technology node shows that 8-core clustering gives the best energy-delay product, whereas when die area is taken into account, 4-core clusters give the best EDA2P and EDAP.
Proceedings Article

CACTI-3DD: architecture-level modeling for 3D die-stacked DRAM main memory

TL;DR: CACTI-3DD is introduced, the first architecture-level integrated power, area, and timing modeling framework for 3D die-stacked off-chip DRAM main memory, and the results show that the 3D DRAM with re-architected DRAM dies achieves significant improvements in power and timing compared to the coarse-grained 3D die-stacked DRAM.