Charles D. Wait

Researcher at IBM

Publications - 34

Citations - 1266

Charles D. Wait is an academic researcher at IBM. The author has contributed to research on the topics of Addressing mode and Instruction register. The author has an h-index of 13 and has co-authored 34 publications receiving 1259 citations.

Papers

Proceedings Article

An Overview of the BlueGene/L Supercomputer

N. R. Adiga, +114 more

TL;DR: An overview of the BlueGene/L Supercomputer, a massively parallel system of 65,536 nodes based on a new architecture that exploits system-on-a-chip technology to deliver a target peak processing power of 360 teraFLOPS (trillion floating-point operations per second).

Patent

Multi-petascale highly efficient parallel supercomputer

Sameh W. Asaad, +60 more

TL;DR: A multi-petascale, highly efficient parallel supercomputer capable of 100 petaOPS-scale computing at decreased cost, power, and footprint, which allows for a maximum packaging density of processing nodes from an interconnect point of view.

Patent

Structural Power Reduction in Multithreaded Processor

Stephen Joseph Schwinn, +2 more

TL;DR: A circuit arrangement and method utilize a plurality of execution units having different power and performance characteristics and capabilities within a multithreaded processor core, and selectively route instructions having different performance requirements to different execution units.

Patent

Floating point execution unit with fixed point functionality

Mark J. Hickey, +3 more

TL;DR: A floating point execution unit is capable of selectively repurposing one or more adders in an exponent path of the floating point unit to perform fixed point addition operations, thereby providing fixed point functionality in the floating point execution unit.

Proceedings Article

A High-Performance SIMD Floating Point Unit for BlueGene/L: Architecture, Compilation, and Algorithm Design

Leonardo Bachega, +10 more

TL;DR: Preliminary performance data shows that the algorithm-compiler-hardware combination delivers a significant fraction of peak floating-point performance for compute-bound kernels such as matrix multiplication, and delivers peak memory bandwidth for memory-bound kernels such as daxpy, while being largely insensitive to data alignment.
