Proceedings ArticleDOI
An 80-Tile 1.28TFLOPS Network-on-Chip in 65nm CMOS
Sriram R. Vangal,Jason Howard,G. Ruhl,Saurabh Dighe,H. Wilson,J. Tschanz,D. Finan,P. Iyer,A. Singh,Tiju Jacob,Shailendra Jain,S. Venkataraman,Y. Hoskote,Nitin Borkar +13 more
- pp 98-589
Reads0
Chats0
TLDR
A 275mm2 network-on-chip architecture contains 80 tiles arranged as a 10 times 8 2D array of floating-point cores and packet-switched routers, operating at 4GHz, designed to achieve a peak performance of 1.0TFLOPS at 1V while dissipating 98W.Abstract:
A 275mm2 network-on-chip architecture contains 80 tiles arranged as a 10 times 8 2D array of floating-point cores and packet-switched routers, operating at 4GHz. The 15-F04 design employs mesochronous clocking, fine-grained clock gating, dynamic sleep transistors, and body-bias techniques. The 65nm 100M transistor die is designed to achieve a peak performance of 1.0TFLOPS at 1V while dissipating 98W.read more
Citations
More filters
Proceedings ArticleDOI
Thousand core chips: a technology perspective
TL;DR: The many-core architecture, with hundreds to thousands of small cores, is presented to deliver unprecedented compute performance in an affordable power envelope and fine grain power management, memory bandwidth, on die networks, and system resiliency are discussed.
Proceedings ArticleDOI
The multikernel: a new OS architecture for scalable multicore systems
Andrew Baumann,Paul Barham,Pierre-Évariste Dagand,Tim Harris,Rebecca Isaacs,Simon Peter,Timothy Roscoe,Adrian Schüpbach,Akhilesh Singhania +8 more
TL;DR: This work investigates a new OS structure, the multikernel, that treats the machine as a network of independent cores, assumes no inter-core sharing at the lowest level, and moves traditional OS functionality to a distributed system of processes that communicate via message-passing.
Journal ArticleDOI
Photonic Networks-on-Chip for Future Generations of Chip Multiprocessors
TL;DR: Results confirm the unique benefits for future generations of CMPs that can be achieved by bringing optics into the chip in the form of photonic NoCs, as well as a comparative power analysis of a photonic versus an electronic NoC.
Proceedings ArticleDOI
Optimizing NUCA Organizations and Wiring Alternatives for Large Caches with CACTI 6.0
TL;DR: This work implements two major extensions to the CACTI cache modeling tool that focus on interconnect design for a large cache, and adopts state-of-the-art design space exploration strategies for non-uniform cache access (NUCA).
Journal ArticleDOI
Outstanding Research Problems in NoC Design: System, Microarchitecture, and Circuit Perspectives
TL;DR: This paper provides a general description of NoC architectures and applications and enumerates several related research problems organized under five main categories: Application characterization, communication paradigm, communication infrastructure, analysis, and solution evaluation.
References
More filters
Journal ArticleDOI
Networks on chips: a new SoC paradigm
Luca Benini,G. De Micheli +1 more
TL;DR: Focusing on using probabilistic metrics such as average values or variance to quantify design objectives such as performance and power will lead to a major change in SoC design methodologies.
Journal ArticleDOI
Dynamic-sleep transistor and body bias for active leakage power control of microprocessors
TL;DR: In this paper, the authors used dynamic sleep transistors and body bias to control active leakage for a 32-bit integer execution core in 130-nm CMOS technology in order to manage the active power consumption of high-performance digital designs.
Proceedings ArticleDOI
A 65nm logic technology featuring 35nm gate lengths, enhanced channel strain, 8 Cu interconnect layers, low-k ILD and 0.57 /spl mu/m/sup 2/ SRAM cell
P. Bai,C. Auth,Sridhar Balakrishnan,M. Bost,Ruth A. Brain,V. Chikarmane,R. Heussner,Makarem A. Hussein,Jack Hwang,D. Ingerly,R. James,J. Jeong,C. Kenyon,E. Lee,Seung Hwan Lee,Nick Lindert,Mark Y. Liu,Z. Ma,T. Marieb,Anand Portland Murthy,Ramune Nagisetty,Sanjay Natarajan,J. Neirynck,Andrew Ott,C. Parker,J. Sebastian,R. Shaheed,Swaminathan Sivakumar,Joseph M. Steigerwald,S. Tyagi,Cory E. Weber,Bruce Woolery,Yeoh Andrew W,Kevin Zhang,M. Bohr +34 more
TL;DR: A 65nm generation logic technology with 1.2nm physical gate oxide, 35nm gate length, enhanced channel strain, NiSi, 8 layers of Cu interconnect, and low-k ILD for dense high performance logic is presented in this article.
Journal ArticleDOI
A 6.2-GFlops Floating-Point Multiply-Accumulator With Conditional Normalization
TL;DR: A pipelined single-precision floating-point multiply-accumulator (FPMAC) featuring a single-cycle accumulate loop using base 32 and internal carry-save arithmetic with delayed addition is described, allowing removal of the costly normalization step from the critical accumulate loop.
Proceedings ArticleDOI
A 4.2GHz 0.3mm2 256kb Dual-V/sub cc/ SRAM Building Block in 65nm CMOS
Muhammad M. Khellah,Nam Sung Kim,Jason Howard,G. Ruhl,M. Sunna,Yibin Ye,J. Tschanz,Dinesh Somasekhar,Nitin Borkar,F. Hamzaoglu,Gunjan H. Pandya,A. Farhang,Kevin Zhang,Vivek De +13 more
TL;DR: An SRAM macro, implemented in a 65nm CMOS process, uses a dual supply to maximize density while enabling the use of low voltage for the processor core.