Journal Article

Cell broadband engine architecture and its first implementation: a performance view

TL;DR: It is shown that the Cell Broadband Engine (Cell/B.E.) processor can outperform other modern processors by approximately an order of magnitude, and by even more in some cases.
Abstract
The Cell Broadband Engine™ (Cell/B.E.) processor is the first implementation of the Cell Broadband Engine Architecture (CBEA), developed jointly by Sony, Toshiba, and IBM. In addition to use of the Cell/B.E. processor in the Sony Computer Entertainment PLAYSTATION® 3 system, there is much interest in using it for workstations, media-rich electronics devices, and video and image processing systems. The Cell/B.E. processor includes one PowerPC® processor element (PPE) and eight synergistic processor elements (SPEs). The CBEA is designed to be well suited for a wide variety of programming models, and it allows for partitioning of work between the PPE and the eight SPEs. In this paper we show that the Cell/B.E. processor can outperform other modern processors by approximately an order of magnitude and by even more in some cases.


Citations
Proceedings Article

The potential of the cell processor for scientific computing

TL;DR: This work introduces a performance model for Cell and applies it to several key scientific computing kernels: dense matrix multiply, sparse matrix vector multiply, stencil computations, and 1D/2D FFTs, and proposes modest microarchitectural modifications that could significantly increase the efficiency of double-precision calculations.
Proceedings Article

Entering the petaflop era: the architecture and performance of Roadrunner

TL;DR: This paper presents a detailed architectural description of Roadrunner and a detailed performance analysis of the system, and includes a case study of optimizing the MPI-based application Sweep3D to exploit Roadrunner's hybrid architecture.
Journal Article

State-of-the-art in heterogeneous computing

TL;DR: In this paper, the authors present an overview of the state-of-the-art in heterogeneous computing, focusing on three commonly found architectures: the Cell Broadband Engine Architecture, graphics processing units (GPUs), and field programmable gate arrays (FPGAs).

The place and role of innovative technologies in mathematics lessons (Место и роль инновационных технологий на уроках математики)

TL;DR: In this paper, the authors proposed a method to compute the probability of a given node having a negative value for a given value of 0, i.e., a node having no negative value is 0.
Proceedings Article

CudaDMA: optimizing GPU memory bandwidth via warp specialization

TL;DR: This work proposes an approach for programming GPUs with tightly-coupled specialized DMA warps for performing memory transfers between on-chip and off-chip memories, and presents an extensible API, CudaDMA, that encapsulates synchronization and common sequential and strided data transfer patterns.
References
Journal Article

A Fast Computational Algorithm for the Discrete Cosine Transform

TL;DR: A Fast Discrete Cosine Transform algorithm has been developed which provides a factor of six improvement in computational complexity when compared to conventional Discrete Cosine Transform algorithms using the Fast Fourier Transform.
Journal Article

Introduction to the cell multiprocessor

TL;DR: This paper discusses the history of the project, the program objectives and challenges, the design concept, the architecture and programming models, and the implementation of the Cell multiprocessor.
Journal Article

Synergistic Processing in Cell's Multicore Architecture

TL;DR: The streamlined architecture provides an efficient multithreaded execution environment for both scalar and SIMD threads and represents a reaffirmation of the RISC principles of combining leading edge architecture and compiler optimizations.
Proceedings Article

Optimizing Compiler for the CELL Processor

TL;DR: This paper describes several compiler techniques that aim to automatically generate high-quality code across the wide range of heterogeneous parallelism available on the CELL processor; results indicate that significant speedup can be achieved with a high level of support from the compiler.
Journal Article

Software libraries for linear algebra computations on high performance computers

Jack Dongarra, +1 more - 01 Jun 1995
TL;DR: This paper discusses the design of linear algebra libraries for high performance computers, with particular emphasis on the development of scalable algorithms for multiple instruction multiple data (MIMD) distributed memory concurrent computers.