Author

He Peng

Bio: He Peng is an academic researcher at the University of California, San Diego. The author has contributed to research on the topics of speedup and very-large-scale integration. The author has an h-index of 8 and has co-authored 11 publications receiving 140 citations. Previous affiliations of He Peng include the University of California, Berkeley.

Papers
Proceedings Article
19 Jan 2009
TL;DR: An efficient parallel transistor-level full-chip circuit simulation tool with SPICE accuracy is presented; orders-of-magnitude speedup over SPICE is observed for sets of large-scale VLSI circuits.
Abstract: This paper presents an efficient parallel transistor-level full-chip circuit simulation tool with SPICE accuracy. The new approach partitions the circuit into a linear domain and several nonlinear domains based on circuit nonlinearity and connectivity. The linear domain is solved by a parallel fast linear solver, while the nonlinear domains are distributed across different processors and solved in parallel by a direct solver. A parallel domain decomposition technique is used to iteratively solve the different partitions of the circuit and ensure convergence. Different domain decomposition techniques are discussed. Orders-of-magnitude speedup over SPICE is observed for sets of large-scale VLSI circuits.

27 citations

Proceedings Article
20 Apr 2009
TL;DR: This work proposes reliability-aware through-silicon via (TSV) planning for 3D stacked silicon integrated circuits (ICs), with the 3D power distribution network modeled and extracted in the frequency domain, including the impact of the skin effect.
Abstract: This work proposes reliability-aware through-silicon via (TSV) planning for 3D stacked silicon integrated circuits (ICs). The 3D power distribution network is modeled and extracted in the frequency domain, which includes the impact of the skin effect. The worst-case power noise of the 3D power delivery network (PDN) with local TSV failures resulting from the fabrication process or circuit operation is identified in both the frequency and time domains. From the experimental results, it is observed that a single TSV failure could increase the maximum voltage variation up to 70%, which should be considered in nanoscale ICs. The parameters of the 3D PDN are designed such that the power distribution is reliable under local TSV failures. The spatial distribution of the power noise, reliability, and block-out area is analyzed to enhance the reliability of the 3D PDN under local TSV failure.
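The skin effect mentioned in this abstract can be illustrated with the standard skin-depth formula for a good conductor, delta = sqrt(rho / (pi * f * mu)). The sketch below is not taken from the paper: the function name `skin_depth` and the copper resistivity value are assumptions chosen for illustration only.

```python
import math

MU_0 = 4e-7 * math.pi  # vacuum permeability in H/m


def skin_depth(resistivity_ohm_m, frequency_hz, relative_permeability=1.0):
    """Return the skin depth in metres for a good conductor."""
    if frequency_hz <= 0:
        raise ValueError("frequency must be positive")
    mu = MU_0 * relative_permeability
    return math.sqrt(resistivity_ohm_m / (math.pi * frequency_hz * mu))


# Assumed room-temperature resistivity of copper (illustrative value).
RHO_CU = 1.68e-8

for f in (1e6, 1e8, 1e9, 1e10):
    depth_um = skin_depth(RHO_CU, f) * 1e6
    print(f"{f:.0e} Hz: skin depth = {depth_um:.2f} um")
```

Because the depth scales as 1/sqrt(f), current crowds toward the conductor surface as frequency rises, which is why a frequency-domain model can capture the effect directly.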

22 citations

Proceedings Article
18 Nov 2008
TL;DR: In this paper, the authors proposed an efficient flow for the analysis and co-design of large 3D power distribution networks (3D PDN), which can take advantage of parallel computing.
Abstract: In this paper, we propose an efficient flow for the analysis and co-design of large 3D power distribution networks (3D PDN). In this flow, the network is modeled in the frequency domain and thus can take advantage of parallel computing. The proposed flow significantly reduces the CPU time while obtaining accurate results as compared to commercial simulation tools. In the established 3D PDN model, we incorporate the on-chip voltage regulator module (VRM) and the effect of on-chip inductance. The impact of each design parameter of the 3D PDN on simultaneous switching noise (SSN) is investigated based on the model.

20 citations

Proceedings Article
20 Apr 2009
TL;DR: A fully parallel transistor-level full-chip circuit simulation tool with SPICE accuracy for general circuit designs is presented; the proposed overlapping domain decomposition approach partitions the circuit into a linear subdomain and multiple nonlinear subdomains based on circuit nonlinearity and connectivity.
Abstract: In this paper, we present a fully parallel transistor-level full-chip circuit simulation tool with SPICE accuracy for general circuit designs. The proposed overlapping domain decomposition approach partitions the circuit into a linear subdomain and multiple nonlinear subdomains based on circuit nonlinearity and connectivity. A parallel iterative matrix solver is used to solve the linear domain, while the nonlinear subdomains are distributed topologically across different processors and solved in parallel by a direct solver. To achieve maximum parallelism, device model evaluation is also done in parallel. A parallel domain decomposition technique is used to iteratively solve the different partitions of the circuit and ensure convergence. Orders-of-magnitude speedup over SPICE is observed for sets of large-scale circuit designs on up to 64 processors.
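The idea of iterating over overlapping partitions until the whole circuit converges can be sketched on a toy problem. The example below is not the authors' tool: it applies a sequential (multiplicative) overlapping Schwarz iteration to a purely linear resistor chain, and the function names, the 12-node chain, the conductance values, and the subdomain ranges are all assumptions made for illustration.

```python
OFF = -1.0   # assumed coupling between neighbouring nodes (unit conductance)
DIAG = 2.1   # assumed diagonal: two neighbours plus a small conductance to ground


def solve_tridiagonal(lower, diag, upper, rhs):
    """Direct solve of a tridiagonal system with the Thomas algorithm."""
    n = len(diag)
    c = [0.0] * n
    d = [0.0] * n
    c[0] = upper[0] / diag[0]
    d[0] = rhs[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / denom
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / denom
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x


def schwarz_solve(n, subdomains, rhs, tol=1e-12, max_sweeps=200):
    """Sweep over overlapping subdomains (half-open index ranges) until converged."""
    x = [0.0] * n
    for sweep in range(1, max_sweeps + 1):
        max_change = 0.0
        for start, stop in subdomains:
            m = stop - start
            local_rhs = list(rhs[start:stop])
            # Nodes just outside the subdomain act as fixed boundary values.
            if start > 0:
                local_rhs[0] -= OFF * x[start - 1]
            if stop < n:
                local_rhs[-1] -= OFF * x[stop]
            local = solve_tridiagonal([OFF] * m, [DIAG] * m, [OFF] * m, local_rhs)
            for k in range(m):
                max_change = max(max_change, abs(local[k] - x[start + k]))
                x[start + k] = local[k]
        if max_change < tol:
            return x, sweep
    return x, max_sweeps


n = 12
rhs = [1.0] + [0.0] * (n - 1)       # unit current injected at the first node
subdomains = [(0, 8), (4, 12)]      # nodes 4..7 belong to both subdomains
x, sweeps = schwarz_solve(n, subdomains, rhs)
direct = solve_tridiagonal([OFF] * n, [DIAG] * n, [OFF] * n, rhs)
error = max(abs(a - b) for a, b in zip(x, direct))
print(f"converged in {sweeps} sweeps, max error vs direct solve = {error:.2e}")
```

In this sketch the subdomains are visited one after another; a parallel variant would solve them concurrently from the previous iterate and exchange boundary values between sweeps.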

15 citations

Journal Article
TL;DR: An efficient transistor-level simulation tool with SPICE accuracy is introduced for deep-submicrometer very-large-scale integration circuits with strong coupling effects; orders-of-magnitude speedup over Berkeley SPICE3 is observed.
Abstract: In this paper, we introduce an efficient transistor-level simulation tool with SPICE accuracy for deep-submicrometer very-large-scale integration circuits with strong coupling effects. The new approach uses multigrid for huge networks of power/ground, clock, and interconnect with strong coupling. Mutual inductance can be incorporated without error-prone matrix sparsification approximations or expensive matrix inversion. Transistor devices are integrated using a novel two-stage Newton-Raphson method to dynamically model the boundary between the linear network and the nonlinear devices. Orders-of-magnitude speedup over Berkeley SPICE3 is observed for sets of real deep-submicrometer design circuits.
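As background for the Newton-Raphson step mentioned in this abstract, the sketch below applies ordinary step-limited Newton-Raphson to a single diode in series with a resistor and a voltage source. It is not the two-stage method from the paper; the function name `solve_diode_voltage`, the circuit, and all parameter values are assumptions for illustration.

```python
import math


def solve_diode_voltage(vs, r, i_sat, v_t, v0=0.0, max_step=0.1,
                        tol=1e-12, max_iter=100):
    """Solve (vs - vd)/r = i_sat*(exp(vd/v_t) - 1) for the diode voltage vd."""
    vd = v0
    for iteration in range(1, max_iter + 1):
        exp_term = math.exp(vd / v_t)
        # Kirchhoff current balance at the diode node.
        residual = (vs - vd) / r - i_sat * (exp_term - 1.0)
        # Derivative of the residual with respect to vd.
        slope = -1.0 / r - (i_sat / v_t) * exp_term
        step = -residual / slope
        # Limit the step so the exponential cannot overshoot wildly.
        step = max(-max_step, min(max_step, step))
        vd += step
        if abs(step) < tol:
            return vd, iteration
    raise RuntimeError("Newton-Raphson did not converge")


# Assumed example values: 5 V source, 1 kOhm resistor, generic silicon diode.
VS, R, I_SAT, V_T = 5.0, 1000.0, 1e-14, 0.02585
vd, iterations = solve_diode_voltage(VS, R, I_SAT, V_T)
current = (VS - vd) / R
print(f"vd = {vd:.4f} V, i = {current * 1e3:.3f} mA after {iterations} iterations")
```

The step limit is a simple stand-in for the damping schemes circuit simulators use; without it, the first update from a zero initial guess would jump far past the diode's operating point.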

14 citations


Cited by
Book
11 Mar 2009
TL;DR: EDA/VLSI practitioners and researchers in need of fluency in an "adjacent" field will find this an invaluable reference to the basic EDA concepts, principles, data structures, algorithms, and architectures for the design, verification, and test of VLSI circuits.
Abstract: This book provides broad and comprehensive coverage of the entire EDA flow. EDA/VLSI practitioners and researchers in need of fluency in an "adjacent" field will find this an invaluable reference to the basic EDA concepts, principles, data structures, algorithms, and architectures for the design, verification, and test of VLSI circuits. Anyone who needs to learn the concepts, principles, data structures, algorithms, and architectures of the EDA flow will benefit from this book.
Covers complete spectrum of the EDA flow, from ESL design modeling to logic/test synthesis, verification, physical design, and test - helps EDA newcomers to get "up-and-running" quickly
Includes comprehensive coverage of EDA concepts, principles, data structures, algorithms, and architectures - helps all readers improve their VLSI design competence
Contains latest advancements not yet available in other books, including test compression, ESL design modeling, large-scale floorplanning, placement, routing, synthesis of clock and power/ground networks - helps readers to design/develop testable chips or products
Includes industry best practices wherever appropriate in most chapters - helps readers avoid costly mistakes
Table of Contents
Chapter 1: Introduction
Chapter 2: Fundamentals of CMOS Design
Chapter 3: Design for Testability
Chapter 4: Fundamentals of Algorithms
Chapter 5: Electronic System-Level Design and High-Level Synthesis
Chapter 6: Logic Synthesis in a Nutshell
Chapter 7: Test Synthesis
Chapter 8: Logic and Circuit Simulation
Chapter 9: Functional Verification
Chapter 10: Floorplanning
Chapter 11: Placement
Chapter 12: Global and Detailed Routing
Chapter 13: Synthesis of Clock and Power/Ground Networks
Chapter 14: Fault Simulation and Test Generation

200 citations

Journal Article
TL;DR: A redundant TSV architecture with reasonable cost is proposed in this paper and analysis on overall yield shows that the proposed design can successfully recover most of the failed chips and increase the yield of TSV to 99.4%.
Abstract: 3-D technology provides many benefits including high density, high bandwidth, low power, and small form factor. The through-silicon via (TSV), which provides communication links between dies in the vertical direction, is a critical design issue in 3-D integration. Just like other components, the fabrication and bonding of TSVs can fail. A failed TSV can severely increase the cost and decrease the yield as the number of dies to be stacked increases. A redundant TSV architecture with reasonable cost is proposed in this paper. Based on probabilistic models, some interesting findings are reported. First, the number of failed TSVs in a tier is usually less than 2 when the number of TSVs in a tier is less than 1000, and less than 5 when the number of TSVs in a tier is less than 10000. Second, assuming that there are at most 2-5 failed TSVs in a tier and that one redundant TSV is allocated to each TSV block, our proposed structure leads to 90% and 95% recovery rates for TSV blocks of size 50 and 25, respectively. Finally, analysis of overall yield shows that the proposed design can successfully recover most of the failed chips and increase the yield of TSV to 99.4%.
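The kind of probabilistic reasoning described in this abstract can be sketched with a simple binomial model. This is not the paper's model and is not expected to reproduce its figures: it assumes each TSV fails independently with the same probability, and the failure probability `P_FAIL`, the function names, and the rule that a block survives when at most one of its TSVs (including the spare) fails are all hypothetical.

```python
from math import comb


def prob_at_most(k, n, p):
    """Probability that at most k of n independent TSVs fail (binomial CDF)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def block_yield(block_size, p, spares=1):
    """Probability that a block plus its spares has no more failures than spares."""
    return prob_at_most(spares, block_size + spares, p)


# Hypothetical per-TSV failure probability, chosen only for illustration.
P_FAIL = 1e-4

for tsvs_in_tier, limit in ((1000, 1), (10000, 4)):
    prob = prob_at_most(limit, tsvs_in_tier, P_FAIL)
    print(f"{tsvs_in_tier} TSVs: P(at most {limit} failures) = {prob:.4f}")

for size in (50, 25):
    without_spare = (1 - P_FAIL) ** size
    with_spare = block_yield(size, P_FAIL)
    print(f"block of {size}: yield {without_spare:.6f} -> {with_spare:.6f} with one spare")
```

Under these assumptions, smaller blocks give a higher per-block yield at the cost of more spare TSVs overall, which is the trade-off a redundancy architecture has to balance.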

121 citations

Journal Article
TL;DR: An adaptive sparse matrix solver called NICSLU is proposed, which uses a multithreaded parallel LU factorization algorithm on shared-memory computers with multicore/multisocket central processing units to accelerate circuit simulation.
Abstract: The sparse matrix solver has become a bottleneck in simulation program with integrated circuit emphasis (SPICE)-like circuit simulators. It is difficult to parallelize the solver because of the high data dependency during the numeric LU factorization and the irregular structure of circuit matrices. This paper proposes an adaptive sparse matrix solver called NICSLU, which uses a multithreaded parallel LU factorization algorithm on shared-memory computers with multicore/multisocket central processing units to accelerate circuit simulation. The solver can be used in all SPICE-like circuit simulators. A simple method is proposed to predict whether a matrix is suitable for parallel factorization, such that each matrix can achieve optimal performance. The experimental results on 35 matrices reveal that NICSLU achieves speedups of 2.08× to 8.57× (on the geometric mean), compared with KLU, with 1-12 threads, for the matrices that are suitable for the parallel algorithm. NICSLU can be downloaded from http://nicslu.weebly.com.

66 citations

Journal Article
TL;DR: This work provides a qualitative perspective on the power and thermal dissipation issues in 3-D integration, studies the impact of through-silicon via (TSV) size on their mitigation, and discusses the design implications in the presence of decoupling capacitors, TSV/on-die/package parasitics, various resonance effects, and power gating.
Abstract: 3-D integration presents a path to higher performance, greater density, increased functionality, and heterogeneous technology implementation. However, 3-D integration introduces many challenges for power and thermal integrity due to large switching currents, longer power delivery paths, and increased parasitics compared to 2-D integration. In this work, we provide an in-depth study of power and thermal issues while incorporating the physical design characteristics unique to 3-D integration. We provide a qualitative perspective on the power and thermal dissipation issues in 3-D integration and study the impact of through-silicon via (TSV) size on their mitigation. We investigate and discuss the design implications of power and thermal issues in the presence of decoupling capacitors, TSV/on-die/package parasitics, various resonance effects, and power gating. Our study is based on a ten-tier system utilizing existing 3-D technology specifications. Based on detailed power distribution and heat dissipation models, we present a comprehensive analysis of TSV tapering for alleviating power and thermal integrity issues in 3-D ICs.

63 citations