Author
Dennis Sylvester
Other affiliations: Synopsys, University of California, Berkeley, Hewlett-Packard
Bio: Dennis Sylvester is an academic researcher from the University of Michigan. The author has contributed to research in topics including CMOS and low-power electronics. The author has an h-index of 79, has co-authored 623 publications, and has received 22,988 citations. Previous affiliations of Dennis Sylvester include Synopsys and the University of California, Berkeley.
Papers published on a yearly basis
Papers
22 Jan 2010
TL;DR: In this paper, the authors define and explore near-threshold computing (NTC), a design space where the supply voltage is approximately equal to the threshold voltage of the transistors.
Abstract: Power has become the primary design constraint for chip designers today. While Moore's law continues to provide additional transistors, power budgets have begun to prohibit those devices from actually being used. To reduce energy consumption, voltage scaling has proved a popular technique, with subthreshold design representing the endpoint of voltage scaling. Although it is extremely energy efficient, subthreshold design has been relegated to niche markets due to its major performance penalties. This paper defines and explores near-threshold computing (NTC), a design space where the supply voltage is approximately equal to the threshold voltage of the transistors. This region retains much of the energy savings of subthreshold operation with more favorable performance and variability characteristics. This makes it applicable to a broad range of power-constrained computing segments from sensors to high performance servers. This paper explores the barriers to the widespread adoption of NTC and describes current work aimed at overcoming these obstacles.
767 citations
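As a rough illustration of the tradeoff this abstract describes, the sketch below evaluates dynamic CV² energy together with an alpha-power delay model as the supply approaches the threshold voltage. The threshold voltage, exponent, and switched capacitance are assumed values chosen only for illustration; they are not taken from the paper, and the alpha-power model overstates the slowdown very close to Vt.

```python
# Rough sketch (illustrative constants, not the paper's data): dynamic energy and
# relative gate delay as Vdd approaches Vt, using CV^2 energy and an alpha-power
# delay model d ~ Vdd / (Vdd - Vt)^alpha.
VT = 0.35       # assumed threshold voltage (V)
ALPHA = 1.5     # assumed velocity-saturation exponent
C_EFF = 1e-12   # assumed switched capacitance per operation (F)

def alpha_power_delay(vdd):
    return vdd / (vdd - VT) ** ALPHA

def rel_delay(vdd, vdd_nom=1.0):
    """Gate delay relative to nominal Vdd."""
    return alpha_power_delay(vdd) / alpha_power_delay(vdd_nom)

def dyn_energy_fj(vdd):
    """Dynamic switching energy per operation in femtojoules."""
    return C_EFF * vdd ** 2 * 1e15

for vdd in (1.0, 0.7, 0.5, 0.4):   # nominal, near-threshold, approaching Vt
    print(f"Vdd={vdd:.1f} V  energy={dyn_energy_fj(vdd):6.0f} fJ  delay={rel_delay(vdd):4.1f}x")
```

With these assumed numbers the printout shows the qualitative point of the abstract: operating near Vt recovers a large share of the subthreshold energy savings at a much smaller delay penalty than deep subthreshold operation.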
01 Jan 2000
TL;DR: A new paradigm of predictive MOSFET and interconnect modeling is introduced to specifically address SPICE-compatible parameters for future technology generations; comparisons with published data and 2D simulations are used to verify this predictive technology model.
Abstract: A new paradigm of predictive MOSFET and interconnect modeling is introduced. This approach is developed to specifically address SPICE-compatible parameters for future technology generations. For a given technology node, designers can use default values or directly input Leff, Tox, Vt, Rdsw, and interconnect dimensions to instantly obtain a BSIM3v3 customized model for early stages of circuit design and research. Models for 0.18 μm and 0.13 μm technology nodes with Leff down to 70 nm are currently available on the web. Comparisons with published data and 2D simulations are used to verify this predictive technology model.
544 citations
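To make the "instantly obtain a customized model" idea concrete, here is a hedged sketch that formats a handful of user-chosen parameters into a skeletal BSIM3v3-style model card. The card layout and every value below are hypothetical illustrations, not the format or output of the actual PTM web tool, and a real BSIM3v3 card carries dozens of additional fitted parameters.

```python
# Hedged sketch: emit a minimal BSIM3v3-style .model card from a few user-chosen
# parameters. Values and card contents are hypothetical, not actual PTM output.
def model_card(name, tox_m, vth0_v, rdsw, leff_note):
    return (
        f"* assumed effective channel length: {leff_note}\n"
        f".model {name} nmos level=49\n"     # level 49 selects BSIM3v3 in common simulators
        f"+ tox={tox_m:g} vth0={vth0_v:g} rdsw={rdsw:g}\n"
    )

# Hypothetical 0.13 um-node inputs, chosen only to show the interface.
print(model_card("nmos_013", tox_m=3.3e-9, vth0_v=0.35, rdsw=200, leff_note="70 nm"))
```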
07 Jun 2004
TL;DR: It is shown that extending the voltage range below 1/2 Vdd will improve the energy efficiency for most processor designs, that extending this range to subthreshold operation is beneficial only for very specific applications, and that operation deep in the subthreshold voltage range is never energy-efficient.
Abstract: Dynamic voltage scaling (DVS) is a popular approach for energy reduction of integrated circuits. Current processors that use DVS typically have an operating voltage range from full to half of the maximum Vdd. However, it is possible to construct designs that operate over a much larger voltage range: from full Vdd to subthreshold voltages. This possibility raises the question of whether a larger voltage range improves the energy efficiency of DVS. First, from a theoretical point of view, we show that for subthreshold supply voltages leakage energy becomes dominant, making "just in time completion" energy inefficient. We derive an analytical model for the minimum energy optimal voltage and study its trends with technology scaling. Second, we use the proposed model to study the workload activity of an actual processor and analyze the energy efficiency as a function of the lower limit of voltage scaling. Based on this study, we show that extending the voltage range below 1/2 Vdd will improve the energy efficiency for most processor designs, while extending this range to subthreshold operation is beneficial only for very specific applications. Finally, we show that operation deep in the subthreshold voltage range is never energy-efficient.
431 citations
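The minimum-energy argument in this abstract can be illustrated with a toy model: dynamic CV² energy falls with Vdd, but the cycle time grows so quickly in subthreshold that leakage energy accumulated per operation eventually dominates. The sketch below uses an EKV-style continuous current expression and assumed constants (threshold voltage, slope factor, gate counts, activity factor); these are illustrative only and are not the paper's analytical model.

```python
# Toy model (assumed constants, not the paper's): total energy per cycle as
# dynamic switching energy of the active gates plus leakage of all gates
# integrated over the critical-path delay.
import math

VT, NF, PHI_T = 0.35, 1.5, 0.026      # assumed Vt (V), slope factor, kT/q at 300 K (V)
C_G, I_SPEC = 1e-15, 1e-7             # assumed per-gate capacitance (F), current prefactor (A)
N_GATES, ACTIVITY, LOGIC_DEPTH = 1e5, 0.1, 30

def i_on(vdd):
    # EKV-style continuous drain current, valid above and below threshold.
    return I_SPEC * math.log(1 + math.exp((vdd - VT) / (2 * NF * PHI_T))) ** 2

def energy_per_cycle(vdd):
    t_cycle = LOGIC_DEPTH * C_G * vdd / i_on(vdd)     # critical-path delay sets the cycle
    i_leak = I_SPEC * math.exp(-VT / (NF * PHI_T))    # per-gate off current
    e_dyn = ACTIVITY * N_GATES * C_G * vdd ** 2       # only the switching gates pay CV^2
    e_leak = N_GATES * i_leak * vdd * t_cycle         # every gate leaks for the whole cycle
    return e_dyn + e_leak

vdd_min = min((v / 100 for v in range(15, 101)), key=energy_per_cycle)
print(f"toy-model minimum-energy Vdd ≈ {vdd_min:.2f} V")
```

With these assumed numbers the toy minimum lands between roughly 0.25 V and 0.3 V; the exact value is meaningless, but the existence of a minimum well above the deepest subthreshold voltages is the point the abstract makes about "just in time completion" becoming energy inefficient.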
01 Nov 1998
TL;DR: A comprehensive approach to accurately characterize the device and interconnect characteristics of present and future process generations is described, resulting in the generation of a representative strawman technology that is used in conjunction with analytical model simulation tools and empirical design data to obtain a realistic picture of the future of circuit design.
Abstract: We take a fresh look at the problems posed by deep submicron (DSM) geometries and re-open the investigation into how DSM effects are most likely to affect future design methodologies. We describe a comprehensive approach to accurately characterize the device and interconnect characteristics of present and future process generations. This approach results in the generation of a representative strawman technology that is used in conjunction with analytical model simulation tools and empirical design data to obtain a realistic picture of the future of circuit design. We then proceed to quantify the precise impact of interconnect, including delay degradation due to noise, on high performance ASIC designs. Having determined the role of interconnect in performance, we then reconsider the impact of future processes on ASIC design methodology.
322 citations
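As a back-of-the-envelope illustration of why interconnect comes to dominate in deep submicron designs, the sketch below computes a first-order Elmore delay for a driver plus a distributed RC wire: the wire term grows quadratically with length while the driver term grows only linearly. The sheet resistance, capacitance per micron, and driver values are assumed for illustration and are not the paper's strawman technology parameters.

```python
# Illustrative sketch (assumed values, not the paper's strawman data): Elmore delay
# of a lumped driver plus a distributed RC wire of varying length.
R_SQ = 0.08            # assumed metal sheet resistance (ohm/square)
C_PER_UM = 0.2e-15     # assumed wire capacitance per micron (F/um)
W_UM = 0.5             # assumed wire width (um)
R_DRV, C_LOAD = 1e3, 2e-15   # assumed driver resistance (ohm) and receiver load (F)

def wire_delay(length_um):
    r_wire = R_SQ * length_um / W_UM
    c_wire = C_PER_UM * length_um
    # Driver charges everything; the distributed wire contributes ~0.5*R*C of its own.
    return R_DRV * (c_wire + C_LOAD) + 0.5 * r_wire * c_wire + r_wire * C_LOAD

for length in (100, 1000, 5000):   # local, intermediate, and global wire lengths (um)
    print(f"{length:>5} um wire: Elmore delay ≈ {wire_delay(length)*1e12:.1f} ps")
```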
TL;DR: A voltage reference for use in ultra-low power systems, referred to as the 2T voltage reference, is proposed and demonstrated in silicon across three CMOS technologies; measurements show the design exhibits spreads in TC and output voltage comparable to existing voltage references in the literature.
Abstract: Sensing systems such as biomedical implants, infrastructure monitoring systems, and military surveillance units are constrained to consume only picowatts to nanowatts in standby and active mode, respectively. This tight power budget places ultra-low power demands on all building blocks in the systems. This work proposes a voltage reference for use in such ultra-low power systems, referred to as the 2T voltage reference, which has been demonstrated in silicon across three CMOS technologies. Prototype chips in 0.13 μm show a temperature coefficient of 16.9 ppm/°C (best) and line sensitivity of 0.033%/V, while consuming 2.22 pW in 1350 μm². The lowest functional Vdd is 0.5 V. The proposed design improves energy efficiency by 2 to 3 orders of magnitude while exhibiting better line sensitivity and temperature coefficient in less area, compared to other nanowatt voltage references. For process spread analysis, 49 dies are measured across two runs, showing that the design exhibits spreads in TC and output voltage comparable to existing voltage references in the literature. Digital trimming is also demonstrated: one-temperature-point trimming, assisted by initial samples trimmed at two temperature points, enables TC < 50 ppm/°C and ±0.35% output precision across all 25 dies. Ease of technology portability is demonstrated with silicon measurement results in 65 nm, 0.13 μm, and 0.18 μm CMOS technologies.
322 citations
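The headline figures above (ppm/°C and %/V) follow the standard figure-of-merit definitions for voltage references, sketched below. The sweep data in the example are made-up placeholders for illustration, not measurements from the paper.

```python
# Standard figure-of-merit definitions for a voltage reference: temperature
# coefficient in ppm/°C and line sensitivity in %/V. The data arrays passed in
# below are hypothetical placeholders, not measurements from the paper.
def temp_coefficient_ppm(vref_over_temp, t_min_c, t_max_c):
    vmin, vmax = min(vref_over_temp), max(vref_over_temp)
    vmean = sum(vref_over_temp) / len(vref_over_temp)
    return (vmax - vmin) / (vmean * (t_max_c - t_min_c)) * 1e6

def line_sensitivity_pct(vref_over_vdd, vdd_min, vdd_max):
    vmin, vmax = min(vref_over_vdd), max(vref_over_vdd)
    vmean = sum(vref_over_vdd) / len(vref_over_vdd)
    return (vmax - vmin) / (vmean * (vdd_max - vdd_min)) * 100

# Hypothetical sweeps, chosen only to exercise the formulas.
print(temp_coefficient_ppm([0.1760, 0.1761, 0.1762, 0.1761], -20, 80))  # ppm/°C
print(line_sensitivity_pct([0.17600, 0.17602, 0.17603], 0.5, 3.0))      # %/V
```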
Cited by
TL;DR: Focusing on using probabilistic metrics such as average values or variance to quantify design objectives such as performance and power will lead to a major change in SoC design methodologies.
Abstract: On-chip micronetworks, designed with a layered methodology, will meet the distinctive challenges of providing functionally correct, reliable operation of interacting system-on-chip components. A system on chip (SoC) can provide an integrated solution to challenging design problems in the telecommunications, multimedia, and consumer electronics domains. Much of the progress in these fields hinges on the designers' ability to conceive complex electronic engines under strong time-to-market pressure. Success will require using appropriate design and process technologies, as well as interconnecting existing components reliably in a plug-and-play fashion. Focusing on using probabilistic metrics such as average values or variance to quantify design objectives such as performance and power will lead to a major change in SoC design methodologies. Overall, these designs will be based on both deterministic and stochastic models. Creating complex SoCs requires a modular, component-based approach to both hardware and software design. Despite numerous challenges, the authors believe that developers will solve the problems of designing SoC networks. At the same time, they believe that a layered micronetwork design methodology will likely be the only path to mastering the complexity of future SoC designs.
3,852 citations
Book
IBM
TL;DR: In this book, the authors highlight the intricate interdependencies and subtle tradeoffs between various practically important device parameters, and also provide an in-depth discussion of device scaling and scaling limits of CMOS and bipolar devices.
Abstract: Learn the basic properties and designs of modern VLSI devices, as well as the factors affecting performance, with this thoroughly updated second edition. The first edition has been widely adopted as a standard textbook in microelectronics in many major US universities and worldwide. The internationally-renowned authors highlight the intricate interdependencies and subtle tradeoffs between various practically important device parameters, and also provide an in-depth discussion of device scaling and scaling limits of CMOS and bipolar devices. Equations and parameters provided are checked continuously against the reality of silicon data, making the book equally useful in practical transistor design and in the classroom. Every chapter has been updated to include the latest developments, such as MOSFET scale length theory, high-field transport model, and SiGe-base bipolar devices.
2,680 citations
12 Dec 2009
TL;DR: Combining power, area, and timing results of McPAT with performance simulation of PARSEC benchmarks at the 22nm technology node, for both common in-order and out-of-order manycore designs, shows that when die cost is not taken into account, clustering 8 cores together gives the best energy-delay product, whereas when cost is taken into account, configuring clusters with 4 cores gives the best EDA²P and EDAP.
Abstract: This paper introduces McPAT, an integrated power, area, and timing modeling framework that supports comprehensive design space exploration for multicore and manycore processor configurations ranging from 90nm to 22nm and beyond. At the microarchitectural level, McPAT includes models for the fundamental components of a chip multiprocessor, including in-order and out-of-order processor cores, networks-on-chip, shared caches, integrated memory controllers, and multiple-domain clocking. At the circuit and technology levels, McPAT supports critical-path timing modeling, area modeling, and dynamic, short-circuit, and leakage power modeling for each of the device types forecast in the ITRS roadmap including bulk CMOS, SOI, and double-gate transistors. McPAT has a flexible XML interface to facilitate its use with many performance simulators. Combined with a performance simulator, McPAT enables architects to consistently quantify the cost of new ideas and assess tradeoffs of different architectures using new metrics like energy-delay-area² product (EDA²P) and energy-delay-area product (EDAP). This paper explores the interconnect options of future manycore processors by varying the degree of clustering over generations of process technologies. Clustering will bring interesting tradeoffs between area and performance because the interconnects needed to group cores into clusters incur area overhead, but many applications can make good use of them due to synergies of cache sharing. Combining power, area, and timing results of McPAT with performance simulation of PARSEC benchmarks at the 22nm technology node for both common in-order and out-of-order manycore designs shows that when die cost is not taken into account, clustering 8 cores together gives the best energy-delay product, whereas when cost is taken into account, configuring clusters with 4 cores gives the best EDA²P and EDAP.
2,487 citations
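The composite metrics named in the abstract are simple products of energy, delay, and area, sketched below. The per-configuration numbers are hypothetical and chosen only so the qualitative ordering mirrors the abstract's conclusion (8-core clusters win on energy-delay product, 4-core clusters win once area enters the metric); the real values come from McPAT plus a performance simulator, which is not modeled here.

```python
# Small sketch of the composite metrics named in the abstract, computed from
# hypothetical per-configuration numbers (energy in J, delay in s, area in mm^2).
def metrics(energy, delay, area):
    return {
        "EDP":   energy * delay,            # energy-delay product
        "EDAP":  energy * delay * area,     # energy-delay-area product
        "EDA2P": energy * delay * area**2,  # energy-delay-area^2 product
    }

# Hypothetical 4-core-cluster vs 8-core-cluster design points, for illustration only.
design_points = {"4-core clusters": (1.0, 1.2, 80.0),
                 "8-core clusters": (0.9, 1.0, 120.0)}
for name, (e, d, a) in design_points.items():
    print(name, metrics(e, d, a))
```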
TL;DR: Tunnel FETs based on ultrathin semiconducting films or nanowires could achieve a 100-fold power reduction over complementary metal–oxide–semiconductor (CMOS) transistors, so integrating tunnel FETs with CMOS technology could improve low-power integrated circuits.
Abstract: Power dissipation is a fundamental problem for nanoelectronic circuits. Scaling the supply voltage reduces the energy needed for switching, but the field-effect transistors (FETs) in today's integrated circuits require at least 60 mV of gate voltage to increase the current by one order of magnitude at room temperature. Tunnel FETs avoid this limit by using quantum-mechanical band-to-band tunnelling, rather than thermal injection, to inject charge carriers into the device channel. Tunnel FETs based on ultrathin semiconducting films or nanowires could achieve a 100-fold power reduction over complementary metal-oxide-semiconductor (CMOS) transistors, so integrating tunnel FETs with CMOS technology could improve low-power integrated circuits.
2,390 citations
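The 60 mV figure in the abstract is the room-temperature thermionic limit on subthreshold swing, (kT/q)·ln 10 per decade of current, which conventional FETs cannot beat and tunnel FETs sidestep. The short check below simply evaluates that expression; the constants are physical and the function name is ours.

```python
# Worked check of the ~60 mV/decade limit cited in the abstract: the subthreshold
# swing of a conventional FET is bounded by (kT/q)*ln(10) because carriers are
# injected over a thermal barrier.
import math

K_B = 1.380649e-23     # Boltzmann constant (J/K)
Q_E = 1.602176634e-19  # elementary charge (C)

def thermionic_swing_mv_per_decade(temp_kelvin):
    return (K_B * temp_kelvin / Q_E) * math.log(10) * 1e3

print(f"{thermionic_swing_mv_per_decade(300):.1f} mV/decade at 300 K")  # ≈ 59.6
```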
18 Dec 2006
TL;DR: The parallel landscape is framed with seven questions, and the following are recommended: • The overarching goal should be to make it easy to write programs that execute efficiently on highly parallel computing systems • The target should be 1000s of cores per chip, as these chips are built from processing elements that are the most efficient in MIPS (million instructions per second) per watt, MIPS per area of silicon, and MIPS per development dollar.
Author(s): Asanovic, K; Bodik, R; Catanzaro, B; Gebis, J; Husbands, P; Keutzer, K; Patterson, D; Plishker, W; Shalf, J; Williams, SW
Abstract: The recent switch to parallel microprocessors is a milestone in the history of computing. Industry has laid out a roadmap for multicore designs that preserves the programming paradigm of the past via binary compatibility and cache coherence. Conventional wisdom is now to double the number of cores on a chip with each silicon generation. A multidisciplinary group of Berkeley researchers met for nearly two years to discuss this change. Our view is that this evolutionary approach to parallel hardware and software may work from 2 or 8 processor systems, but is likely to face diminishing returns as 16 and 32 processor systems are realized, just as returns fell with greater instruction-level parallelism. We believe that much can be learned by examining the success of parallelism at the extremes of the computing spectrum, namely embedded computing and high performance computing. This led us to frame the parallel landscape with seven questions, and to recommend the following:
• The overarching goal should be to make it easy to write programs that execute efficiently on highly parallel computing systems.
• The target should be 1000s of cores per chip, as these chips are built from processing elements that are the most efficient in MIPS (Million Instructions per Second) per watt, MIPS per area of silicon, and MIPS per development dollar.
• Instead of traditional benchmarks, use 13 “Dwarfs” to design and evaluate parallel programming models and architectures. (A dwarf is an algorithmic method that captures a pattern of computation and communication.)
• “Autotuners” should play a larger role than conventional compilers in translating parallel programs.
• To maximize programmer productivity, future programming models must be more human-centric than the conventional focus on hardware or applications.
• To be successful, programming models should be independent of the number of processors.
• To maximize application efficiency, programming models should support a wide range of data types and successful models of parallelism: task-level parallelism, word-level parallelism, and bit-level parallelism.
• Architects should not include features that significantly affect performance or energy if programmers cannot accurately measure their impact via performance counters and energy counters.
• Traditional operating systems will be deconstructed and operating system functionality will be orchestrated using libraries and virtual machines.
• To explore the design space rapidly, use system emulators based on Field Programmable Gate Arrays (FPGAs) that are highly scalable and low cost.
Since real world applications are naturally parallel and hardware is naturally parallel, what we need is a programming model, system software, and a supporting architecture that are naturally parallel. Researchers have the rare opportunity to re-invent these cornerstones of computing, provided they simplify the efficient programming of highly parallel systems.
2,262 citations