Journal Article
Montecito: a dual-core, dual-thread Itanium processor
C. McNairy, Rohit Bhatia +1 more
TL;DR: Intel's Montecito is the first Itanium processor to feature duplicate, dual-thread cores and cache hierarchies on a single die. It features a landmark 1.72 billion transistors and server-focused technologies.
Abstract:
Intel's Montecito is the first Itanium processor to feature duplicate, dual-thread cores and cache hierarchies on a single die. It features a landmark 1.72 billion transistors and server-focused technologies, and it requires only 100 watts of power. Intel's Itanium 2 processor series has regularly delivered additional performance through increased frequency and cache, as evidenced by the 6-Mbyte and 9-Mbyte versions.
Citations
Proceedings Article
Optimizing NUCA Organizations and Wiring Alternatives for Large Caches with CACTI 6.0
TL;DR: This work implements two major extensions to the CACTI cache modeling tool that focus on interconnect design for a large cache, and adopts state-of-the-art design space exploration strategies for non-uniform cache access (NUCA).
Proceedings Article
An Analysis of Efficient Multi-Core Global Power Management Policies: Maximizing Performance for a Given Power Budget
TL;DR: The results show that the best architected policies can come within 1% of the performance of an ideal oracle, while meeting a given chip-level power budget, and are significantly better than static management, even if static scheduling is given oracular knowledge.
Book
Reconfigurable Computing: The Theory and Practice of FPGA-Based Computation
Scott Hauck, André DeHon +1 more
TL;DR: This book is intended as an introduction to the entire range of issues important to reconfigurable computing, using FPGAs as the context, or "computing vehicles" to implement this powerful technology.
Book
Using OpenMP: Portable Shared Memory Parallel Programming (Scientific and Engineering Computation)
TL;DR: Using OpenMP describes how to use OpenMP in full-scale applications to achieve high performance on large-scale architectures, and describes how OpenMP is translated into explicitly multithreaded code, providing a valuable behind-the-scenes account of OpenMP program performance.
Proceedings Article
Use ECP, not ECC, for hard failures in resistive memories
TL;DR: Error-Correcting Pointers (ECP), a new approach to error correction optimized for memories in which errors are the result of permanent cell failures that occur, and are immediately detectable, at write time, provides longer lifetimes than previously proposed solutions with equivalent overhead.
References
Journal Article
The implementation of a 2-core, multi-threaded Itanium family processor
S. Naffziger, B. Stackhouse, T. Grutkowski, D. Josephson, Jayen J. Desai, Elad Alon, Mark Horowitz +6 more
TL;DR: The design of the high end server processor code named Montecito incorporated several ambitious goals requiring innovation, including the incorporation of two legacy cores on-die and at the same time reducing power by 23%.
Journal Article
Itanium 2 processor microarchitecture
C. McNairy, D. Soltis +1 more
TL;DR: The Itanium 2 processor extends the processing power of the Itanium processor family with a capable and balanced microarchitecture. Executing up to six instructions at a time, it provides both performance and binary compatibility for Itanium-based applications and operating systems.
Journal Article
A 90-nm variable frequency clock system for a power-managed Itanium architecture processor
TL;DR: An Itanium architecture microprocessor in 90-nm CMOS with 1.7B transistors implements a dynamically variable-frequency clock system that supports a power management scheme, which maximizes processor performance within a configured power envelope.
Proceedings Article
Effective instruction prefetching in chip multiprocessors for modern commercial applications
TL;DR: This paper proposes an efficient discontinuity prefetching scheme that can be effectively combined with traditional sequential prefetching to address all forms of instruction cache misses and demonstrates that the combination of the proposed schemes is successful in reducing the instruction miss rate to only 10%-16% of the original miss rate.
Proceedings Article
Power and temperature control on a 90nm Itanium®-family processor
TL;DR: This paper describes the embedded feedback and control system on a 90 nm Itanium®-family processor that maximizes performance while staying within a target power and temperature envelope.