Author

D. Weisner

Bio: D. Weisner is an academic researcher. The author has contributed to research in topics: Register file & Low-power electronics. The author has an h-index of 1 and has co-authored 1 publication receiving 144 citations.

Papers

Proceedings Article•DOI•

A Power-Efficient High-Throughput 32-Thread SPARC Processor

Ana Sonia Leon, Jinuk Luke Shin, K.W. Tam, W. Bryg, Francis Schumacher, P. Kongetira, D. Weisner, A. Strong

18 Sep 2006

TL;DR: The first generation of Niagara SPARC processors implements a power-efficient multi-threading architecture to achieve high throughput with minimum hardware complexity.

Abstract: This first generation of "Niagara" SPARC processors implements a power-efficient Chip Multi-Threading (CMT) architecture which maximizes overall throughput performance for commercial workloads. The target performance is achieved by exploiting high bandwidth rather than high frequency, thereby reducing hardware complexity and power. The UltraSPARC T1 processor combines eight four-threaded 64-b cores, a floating-point unit, a high-bandwidth interconnect crossbar, a shared 3-MB L2 Cache, four DDR2 DRAM interfaces, and a system interface unit. Power and thermal monitoring techniques further enhance CMT performance benefits, increasing overall chip reliability. The 378-mm² die is fabricated in Texas Instruments' 90-nm CMOS technology with nine layers of copper interconnect. The chip contains 279 million transistors and consumes a maximum of 63 W at 1.2 GHz and 1.2 V. Key functional units employ special circuit techniques to provide the high bandwidth required by a CMT architecture while optimizing power and silicon area. These include a highly integrated integer register file, a high-bandwidth interconnect crossbar, the shared L2 cache, and the IO subsystem. Key aspects of the physical design methodology are also discussed.

144 citations

Cited by

Proceedings Article•DOI•

McPAT: an integrated power, area, and timing modeling framework for multicore and manycore architectures

Sheng Li¹, Jung Ho Ahn², Richard Strong³, Jay B. Brockman¹, Dean M. Tullsen³, Norman P. Jouppi⁴

University of Notre Dame¹, Seoul National University², University of California, San Diego³, Hewlett-Packard⁴

12 Dec 2009

TL;DR: Combining power, area, and timing results of McPAT with performance simulation of PARSEC benchmarks at the 22nm technology node for both common in-order and out-of-order manycore designs shows that when die cost is not taken into account, clustering 8 cores together gives the best energy-delay product, whereas when cost is taken into account, configuring clusters with 4 cores gives the best EDA2P and EDAP.

Abstract: This paper introduces McPAT, an integrated power, area, and timing modeling framework that supports comprehensive design space exploration for multicore and manycore processor configurations ranging from 90nm to 22nm and beyond. At the microarchitectural level, McPAT includes models for the fundamental components of a chip multiprocessor, including in-order and out-of-order processor cores, networks-on-chip, shared caches, integrated memory controllers, and multiple-domain clocking. At the circuit and technology levels, McPAT supports critical-path timing modeling, area modeling, and dynamic, short-circuit, and leakage power modeling for each of the device types forecast in the ITRS roadmap including bulk CMOS, SOI, and double-gate transistors. McPAT has a flexible XML interface to facilitate its use with many performance simulators. Combined with a performance simulator, McPAT enables architects to consistently quantify the cost of new ideas and assess tradeoffs of different architectures using new metrics like energy-delay-area2 product (EDA2P) and energy-delay-area product (EDAP). This paper explores the interconnect options of future manycore processors by varying the degree of clustering over generations of process technologies. Clustering will bring interesting tradeoffs between area and performance because the interconnects needed to group cores into clusters incur area overhead, but many applications can make good use of them due to synergies of cache sharing. Combining power, area, and timing results of McPAT with performance simulation of PARSEC benchmarks at the 22nm technology node for both common in-order and out-of-order manycore designs shows that when die cost is not taken into account clustering 8 cores together gives the best energy-delay product, whereas when cost is taken into account configuring clusters with 4 cores gives the best EDA2P and EDAP.
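
Note: the metrics named in this abstract can be illustrated with a minimal sketch. It assumes the usual reading of the names: energy-delay product (EDP) is energy times delay, EDAP multiplies that by area, and EDA2P multiplies it by area squared. The `DesignPoint` class, the `best` helper, and all numbers below are hypothetical and are not taken from the paper or from McPAT itself.

```python
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass(frozen=True)
class DesignPoint:
    """Hypothetical summary of one simulated chip configuration."""
    name: str
    energy_j: float   # total energy for the workload, in joules
    delay_s: float    # execution time, in seconds
    area_mm2: float   # die area, in square millimetres

    def edp(self) -> float:
        # Energy-delay product: ignores die area (and therefore cost).
        return self.energy_j * self.delay_s

    def edap(self) -> float:
        # Energy-delay-area product: area enters linearly.
        return self.edp() * self.area_mm2

    def eda2p(self) -> float:
        # Energy-delay-area^2 product: area is weighted more heavily.
        return self.edp() * self.area_mm2 ** 2


def best(points: Iterable[DesignPoint],
         metric: Callable[[DesignPoint], float]) -> DesignPoint:
    """Return the design point that minimizes the given metric."""
    return min(points, key=metric)


# Made-up values, chosen only to show that the metrics can disagree.
small = DesignPoint("small-cluster", energy_j=2.0, delay_s=3.0, area_mm2=10.0)
large = DesignPoint("large-cluster", energy_j=1.0, delay_s=4.0, area_mm2=20.0)
candidates = [small, large]

winner_edp = best(candidates, DesignPoint.edp)
winner_edap = best(candidates, DesignPoint.edap)
winner_eda2p = best(candidates, DesignPoint.eda2p)
```

With these illustrative inputs the larger configuration wins on EDP while the smaller one wins on EDAP and EDA2P, which mirrors the kind of tradeoff the abstract describes once die area is counted.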

2,487 citations

Journal Article•DOI•

Niagara: a 32-way multithreaded Sparc processor

P. Kongetira¹, Kathirgamar Aingaran¹, Kunle Olukotun¹

Sun Microsystems¹

01 Mar 2005-IEEE Micro

TL;DR: The Niagara processor implements a thread-rich architecture designed to provide a high-performance solution for commercial server applications that exploits the thread-level parallelism inherent to server applications, while targeting low levels of power consumption.

Abstract: The Niagara processor implements a thread-rich architecture designed to provide a high-performance solution for commercial server applications. This is an entirely new implementation of the Sparc V9 architectural specification, which exploits large amounts of on-chip parallelism to provide high throughput. The hardware supports 32 threads with a memory subsystem consisting of an on-board crossbar, level-2 cache, and memory controllers for a highly integrated design that exploits the thread-level parallelism inherent to server applications, while targeting low levels of power consumption.

1,053 citations

Journal Article•DOI•

Multiprocessor System-on-Chip (MPSoC) Technology

Wayne Wolf¹, Ahmed Amine Jerraya¹, G. Martin

Georgia Institute of Technology¹

01 Oct 2008-IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems

TL;DR: This paper surveys the history of MPSoCs to argue that they represent an important and distinct category of computer architecture, and also surveys computer-aided design problems relevant to the design of MPSoCs.

Abstract: The multiprocessor system-on-chip (MPSoC) uses multiple CPUs along with other hardware subsystems to implement a system. A wide range of MPSoC architectures have been developed over the past decade. This paper surveys the history of MPSoCs to argue that they represent an important and distinct category of computer architecture. We consider some of the technological trends that have driven the design of MPSoCs. We also survey computer-aided design problems relevant to the design of MPSoCs.

435 citations

Journal Article•DOI•

Toward Dark Silicon in Servers

Nikos Hardavellas¹, Michael Ferdman², Babak Falsafi³, Anastasia Ailamaki³

Northwestern University¹, Carnegie Mellon University², École Polytechnique Fédérale de Lausanne³

01 Jul 2011-IEEE Micro

TL;DR: Server chips will not scale beyond a few tens to low hundreds of cores, and an increasing fraction of the chip in future technologies will be dark silicon that the authors cannot afford to power.

Abstract: Server chips will not scale beyond a few tens to low hundreds of cores, and an increasing fraction of the chip in future technologies will be dark silicon that we cannot afford to power. Specialized multicore processors, however, can leverage the underutilized die area to overcome the initial power barrier, delivering significantly higher performance for the same bandwidth and power envelopes.

266 citations

Proceedings Article•DOI•

Temperature aware task scheduling in MPSoCs

Ayse K. Coskun¹, Tajana Rosing¹, Keith Whisnant¹

University of California, San Diego¹

16 Apr 2007

TL;DR: This work designs and evaluates OS-level dynamic scheduling policies with negligible performance overhead, and shows that simple-to-implement policies that make decisions based on temperature measurements achieve better temporal and spatial thermal profiles than state-of-the-art schedulers.

Abstract: In deep submicron circuits, elevation in temperatures has brought new challenges in reliability, timing, performance, cooling costs and leakage power. Conventional thermal management techniques sacrifice performance to control the thermal behavior by slowing down or turning off the processors when a critical temperature threshold is exceeded. Moreover, studies have shown that in addition to high temperatures, temporal and spatial variations in temperature impact system reliability. In this work, we explore the benefits of thermally aware task scheduling for multiprocessor systems-on-a-chip (MPSoC). We design and evaluate OS-level dynamic scheduling policies with negligible performance overhead. We show that, using simple to implement policies that make decisions based on temperature measurements, better temporal and spatial thermal profiles can be achieved in comparison to state-of-art schedulers. We also enhance reactive strategies such as dynamic thread migration with our scheduling policies. This way, hot spots and temperature variations are decreased, and the performance cost is significantly reduced.

240 citations
