scispace - formally typeset
Search or ask a question
Topic

PowerPC

About: PowerPC is a research topic. Over the lifetime, 1184 publications have been published within this topic receiving 22297 citations. The topic is also known as: ppc.


Papers
More filters
Book ChapterDOI
24 Feb 2011
TL;DR: A heterogeneous multi-processor architecture, taking into consideration critical parameters such as cache bandwidth and memory latency, is focused on, with results suggesting that for the Sparse Matrix Vector Multiplication kernel, absolute speedups of up to 200 times can be obtained.
Abstract: We evaluate the scalability of a Polymorphic Register File using the Conjugate Gradient method as a case study. We focus on a heterogeneous multi-processor architecture, taking into consideration critical parameters such as cache bandwidth and memory latency. We compare the performance of 256 Polymorphic Register File-augmented workers against a single Cell PowerPC Processor Unit (PPU). In such a scenario, simulation results suggest that for the Sparse Matrix Vector Multiplication kernel, absolute speedups of up to 200 times can be obtained. Moreover, when equal number of workers in the range 1-256 is employed, our design is between 1.7 and 4.2 times faster than a Cell PPU-based system. Furthermore, we study the memory latency and cache bandwidth impact on the sustainable speedups of the system considered. Our tests suggest that a 128 worker configuration requires the caches to deliver 1638.4 GB/sec in order to preserve 80% of its peak speedup.

12 citations

Proceedings ArticleDOI
05 Mar 2011
TL;DR: Software-based fault tolerance strategies for PowerPC devices embedded within Xilinx Virtex 4 FX60 FPGAs are described, using heartbeat monitoring, control flow assertions, and checkpoint/rollback to achieve high performance and low overhead fault tolerance.
Abstract: In this paper we describe our software-based fault tolerance strategies for PowerPC devices embedded within Xilinx Virtex 4 FX60 FPGAs. Traditional FPGA fault tolerance techniques, such as scrubbing and TMR, cannot be applied to the embedded PowerPC. Our work targets scientific applications operating on space-based FPGA architectures consisting of an FPGA and a radiation-hardened controller. We use heartbeat monitoring, control flow assertions, and checkpoint/rollback to achieve high performance and low overhead fault tolerance. Our initial results show we are able to add our fault tolerance strategies with only 2% application overhead while recovering from 94% of the faults injected during testing.12

12 citations

Proceedings ArticleDOI
30 Aug 2009
TL;DR: This paper presents a direct implementation of delimited control operators shift and reset in the MinCaml compiler and shows all the details of how composable continuations can be implemented in the PowerPC microprocessor using a stack discipline.
Abstract: Although delimited control operators are becoming one of the useful tools to manipulate flow of programs, their direct and compiled implementation in a low-level language has not been proposed so far. The only direct and low-level implementations available are Gasbichler and Sperber's implementation in the Scheme 48 virtual machine and Kiselyov's implementation in the OCaml bytecode. Even though these implementations do provide insight into how stack frames are composed, they are not directly portable to compiled implementation in assembly language. This paper presents a direct implementation of delimited control operators shift and reset in the MinCaml compiler. It shows all the details of how composable continuations can be implemented in the PowerPC microprocessor using a stack discipline. We also show an implementation that copies stack frames lazily. To our knowledge, this is the first implementation of shift/reset in assembly language. It makes clear at the assembly language level what we have informally described so far, such as "copying and composing stack frames" and "inserting a reset mark when captured continuations are called". We demonstrate various benchmarks to show the performance of our implementation and discuss its pros and cons.

12 citations

01 Jan 2003
TL;DR: The SPACE algorithm supports precise exceptions, but in an improvement over the previous work, eliminates the need for most hardware register commit operations, which are used to place values in their original program location in the original program sequence.
Abstract: We describe the SPACE algorithm for translating from one architecture such as PowerPC into operations for another architecture such as VLIW, while also supporting scheduling, register al- location, and other optimizations. Our SPACE algorithm supports precise exceptions, but in an improvement over our previous work, eliminates the need for most hardware register commit op- erations, which are used to place values in their original program location in the original program sequence. The elimination of commit operations frees issue slots for other computation, a feature that is especially important for narrower machines. The SPACE algorithm is efficient, running in O(N) time in the number N of operations in the worst case, but in practice is closer to a two-pass O(N) algorithm. The fact that our approach provides precise exceptions with low overhead is useful to program- ming language designers as well — exception models in which an exception can occur at almost any instruction are not prohibitively expensive.

12 citations

15 Oct 2001
TL;DR: The Integrated Computer Control System (ICCS) for the National Ignition Facility (NIF) is a layered architecture of 300 front-end processors coordinated by supervisor subsystems including automatic beam alignment and wavefront control, laser and target diagnostics, pulse power, and shot control timed to 30 ps.
Abstract: The Integrated Computer Control System (ICCS) for the National Ignition Facility (NIF) is a layered architecture of 300 front-end processors (FEP) coordinated by supervisor subsystems including automatic beam alignment and wavefront control, laser and target diagnostics, pulse power, and shot control timed to 30 ps. FEP computers incorporate either VxWorks on PowerPC or Solaris on UltraSPARC processors that interface to over 45,000 control points attached to VME-bus or PCI-bus crates respectively. Typical devices are stepping motors, transient digitizers, calorimeters, and photodiodes. The front-end layer is divided into another segment comprised of an additional 14,000 control points for industrial controls including vacuum, argon, synthetic air, and safety interlocks implemented with Allen-Bradley programmable logic controllers (PLCs). The computer network is augmented asynchronous transfer mode (ATM) that delivers video streams from 500 sensor cameras monitoring the 192 laser beams to operator workstations. Software is based on an object-oriented framework using CORBA distribution that incorporates services for archiving, machine configuration, graphical user interface, monitoring, event logging, scripting, alert management, and access control. Software coding using a mixed language environment of Ada95 and Java is one-third complete at over 300 thousand source lines. Control system installation is currently under way for the first 8 beams, with project completion scheduled for 2008.

12 citations


Network Information
Related Topics (5)
Scalability
50.9K papers, 931.6K citations
77% related
CMOS
81.3K papers, 1.1M citations
77% related
Software
130.5K papers, 2M citations
77% related
Integrated circuit
82.7K papers, 1M citations
76% related
Cache
59.1K papers, 976.6K citations
76% related
Performance
Metrics
No. of papers in the topic in previous years
YearPapers
20232
20226
20215
20208
201916
201823