Showing papers on "PowerPC published in 2003"


Journal ArticleDOI
28 Jul 2003
TL;DR: The designs of WCET tools for a series of increasingly complex processors, including SuperSPARC, Motorola ColdFire 5307, and Motorola PowerPC 755, are described, and some advice is given as to the predictability of processor architectures.
Abstract: The architecture of tools for the determination of worst case execution times (WCETs) as well as the precision of the results of WCET analyses strongly depend on the architecture of the employed processor. The cache replacement strategy influences the results of cache behavior prediction; out-of-order execution and control speculation introduce interferences between processor components, e.g., caches, pipelines, and branch prediction units. These interferences forbid modular designs of WCET tools, which would execute the subtasks of WCET analysis consecutively. Instead, complex integrated designs are needed, resulting in high demand for memory space and analysis time. We have implemented WCET tools for a series of increasingly complex processors: SuperSPARC, Motorola ColdFire 5307, and Motorola PowerPC 755. In this paper, we describe the designs of these tools, report our results and the lessons learned, and give some advice as to the predictability of processor architectures.

272 citations


Bishop Brock1, Karthick Rajamani1
01 Jan 2003
TL;DR: This paper discusses several of the SOC design issues pertaining to dynamic voltage and frequency scalable systems, and how these issues were resolved in the IBM PowerPC 405LP processor, and introduces DPM, a novel architecture for policy-guided dynamic power management.
Abstract: This paper discusses several of the SOC design issues pertaining to dynamic voltage and frequency scalable systems, and how these issues were resolved in the IBM PowerPC 405LP processor. We also introduce DPM, a novel architecture for policy-guided dynamic power management. We illustrate the utility of DPM by its ability to implement several classes of power management strategies and demonstrate practical results for a 405LP embedded system.
I. INTRODUCTION
Advances in low-power components and system design have brought general purpose computation into watches, wireless telephones, PDAs and tablet computers. Power management of these systems has traditionally focused on sleep modes and device power management (1). Embedded processors for these applications are highly integrated system-on-a-chip (SOC) devices that also support aggressive power management through techniques such as programmable clock gating and dynamic voltage and frequency scaling (DVFS). This paper describes one of these processors, and the development of a software architecture for policy-guided dynamic power management.
II. 405LP DESIGN AND POWER MANAGEMENT FEATURES
The IBM PowerPC 405LP is a dynamic voltage and frequency scalable embedded processor targeted at high-performance battery-operated devices. The 405LP is an SOC ASIC design in a 0.18 µm bulk CMOS process, integrating a PowerPC 405 CPU core modified for operation over a 1.0 V to 1.8 V range with off-the-shelf IP cores. The chip includes a flexible clock generation subsystem, new hardware accelerators for speech recognition and security, as well as a novel standby power management controller (2). In a system we normally operate the CPU/SDRAM at 266/133 MHz above 1.65 V and at 66/33 MHz above 0.9 V, typically providing a 13:1 SOC core power range over the 4:1 performance range. From a system design and active power management perspective the most interesting facets of the 405LP SOC design concern the way the clocks are generated and controlled. These features of the processor are described in the remainder of this Section.
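
The 13:1 power range quoted for a 4:1 performance range can be sanity-checked with a short calculation. The sketch below is not taken from the paper: it assumes that SOC core power is dominated by dynamic CMOS switching power, which scales roughly as V² · f, and it ignores leakage. The function name `dynamic_power_ratio` is illustrative only.

```python
def dynamic_power_ratio(v_high, v_low, f_high, f_low):
    """Ratio of dynamic power between two operating points.

    Assumption (not from the abstract): P is proportional to C * V^2 * f,
    with the switched capacitance C identical at both operating points.
    """
    return (v_high / v_low) ** 2 * (f_high / f_low)


# Operating points quoted in the abstract: 266 MHz at 1.65 V, 66 MHz at 0.9 V.
performance_ratio = 266.0 / 66.0
power_ratio = dynamic_power_ratio(1.65, 0.9, 266.0, 66.0)

print(f"performance ratio: {performance_ratio:.1f}:1")
print(f"estimated dynamic power ratio: {power_ratio:.1f}:1")
```

Under these assumptions the estimate comes out at roughly 13.5:1 for a 4:1 frequency range, which is in line with the quoted figure and shows why scaling voltage together with frequency saves far more power than scaling frequency alone.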

129 citations


Journal ArticleDOI
TL;DR: An escape analysis framework for Java is presented to determine (1) if an object is not reachable after its method of creation returns, allowing the object to be allocated on the stack, and (2) if an object is reachable only from a single thread during its lifetime, allowing unnecessary synchronization operations on that object to be removed.
Abstract: This article presents an escape analysis framework for Java to determine (1) if an object is not reachable after its method of creation returns, allowing the object to be allocated on the stack, and (2) if an object is reachable only from a single thread during its lifetime, allowing unnecessary synchronization operations on that object to be removed. We introduce a new program abstraction for escape analysis, the connection graph, that is used to establish reachability relationships between objects and object references. We show that the connection graph can be succinctly summarized for each method such that the same summary information may be used in different calling contexts without introducing imprecision into the analysis. We present an interprocedural algorithm that uses the above property to efficiently compute the connection graph and identify the nonescaping objects for methods and threads. The experimental results, from a prototype implementation of our framework in the IBM High Performance Compiler for Java, are very promising. The percentage of objects that may be allocated on the stack exceeds 70% of all dynamically created objects in the user code in three out of the ten benchmarks (with a median of 19%); 11% to 92% of all mutex lock operations are eliminated in those 10 programs (with a median of 51%), and the overall execution time reduction ranges from 2% to 23% (with a median of 7%) on a 333-MHz PowerPC workstation with 512 MB memory.
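
The reachability idea behind the connection graph can be illustrated with a much simplified sketch. This is not the article's interprocedural algorithm: it assumes a single method, represents the graph as a plain adjacency dictionary, and treats an object as escaping when it is reachable from a node that outlives the method (here only the return value). The names `escaping_nodes`, `stack_allocatable`, and the node labels are hypothetical.

```python
from collections import deque


def escaping_nodes(edges, roots):
    """Return every node reachable from the escaping roots (breadth-first)."""
    seen = set(roots)
    queue = deque(roots)
    while queue:
        node = queue.popleft()
        for successor in edges.get(node, ()):
            if successor not in seen:
                seen.add(successor)
                queue.append(successor)
    return seen


def stack_allocatable(edges, objects, roots):
    """Objects that are not reachable from any escaping root."""
    escaped = escaping_nodes(edges, roots)
    return {obj for obj in objects if obj not in escaped}


# Hypothetical method body, encoded as reference -> object and object -> field edges:
#   tmp = new A()      object "A1", referenced only by the local "tmp"
#   ret = new B()      object "B1", returned to the caller
#   ret.f = new C()    object "C1", reachable through a field of "B1"
edges = {
    "tmp": ["A1"],
    "return": ["B1"],
    "B1": ["C1"],
}
objects = {"A1", "B1", "C1"}

result = stack_allocatable(edges, objects, roots=["return"])
print(sorted(result))
```

In this toy graph only `A1` stays local to the method, so it is the only candidate for stack allocation; `B1` and `C1` are reachable from the return value and must remain on the heap.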

123 citations


Proceedings ArticleDOI
23 Feb 2003
TL;DR: Techniques for energy-efficient design at the algorithm level using FPGAs are presented, and it is shown that FPGAs can outperform DSPs and embedded processors in signal processing while still dissipating less energy than the other two types of devices.
Abstract: In this paper, we present techniques for energy-efficient design at the algorithm level using FPGAs. We then use these techniques to create energy-efficient designs for two signal processing kernel applications: fast Fourier transform (FFT) and matrix multiplication. We evaluate the performance, in terms of both latency and energy efficiency, of FPGAs in performing these tasks. Using a Xilinx Virtex-II as the target FPGA, we compare the performance of our designs to those from the Xilinx library as well as to conventional algorithms run on the PowerPC core embedded in the Virtex-II Pro and the Texas Instruments TMS320C6415. Our evaluations are done both through estimation based on energy and latency equations and through low-level simulation. For FFT, our designs dissipated an average of 60% less energy than the design from the Xilinx library and 56% less than the DSP. Our designs showed a factor of 10 improvement over the embedded processor. These results provide concrete evidence to substantiate the idea that FPGAs can outperform DSPs and embedded processors in signal processing. Further, they show that FPGAs can achieve this performance while still dissipating less energy than the other two types of devices.

79 citations


Proceedings ArticleDOI
01 May 2003
TL;DR: The performance of processors implementing processor-in-memory, stream processing, and tiled processing architectures is evaluated on a common set of signal processing kernels; the results show that these new processors offer significant improvements over conventional systems and that each architecture has its own strengths and weaknesses.
Abstract: Trends in microprocessors of increasing die size and clock speed and decreasing feature sizes have fueled rapidly increasing performance. However, the limited improvements in DRAM latency and bandwidth and diminishing returns of increasing superscalar ILP and cache sizes have led to the proposal of new microprocessor architectures that implement processor-in-memory, stream processing, and tiled processing. Each architecture is typically evaluated separately and compared to a baseline architecture. In this paper, we evaluate the performance of processors that implement these architectures on a common set of signal processing kernels. The implementation results are compared with the measured performance of a conventional system based on the PowerPC with Altivec. The results show that these new processors show significant improvements over conventional systems and that each architecture has its own strengths and weaknesses.

68 citations


Proceedings ArticleDOI
03 Dec 2003
TL;DR: It is shown that the most important factor in the reduction of execution time is cache size (both instruction and data cache), and it is argued that some of the performance gain of advanced processor features also applies to the worst case: although WCET estimates may be more pessimistic, the overall impact is lower WCET estimates.
Abstract: This paper presents a quantification of the timing effects that advanced processor features like data and instruction cache, pipelines, branch prediction units, and out-of-order execution units have on the worst-case execution time (WCET) of programs. These features are present in processors (e.g. PowerPC) that are being widely used in embedded and real-time systems. We present an experimental evaluation of the execution time of a series of synthetic benchmarks and real-life case studies. The execution time is evaluated using extensive testing and a simple WCET technique. We show that the most important factor in reduction of execution time is cache size (both instruction and data cache). Other factors like branch prediction and out-of-order execution have minimal improvements that are cancelled out by the pessimism of the analysis. We also argue that some of the performance gain of advanced processor features also applies to the worst case and although WCET estimates may be more pessimistic the overall impact is that they result in lower WCET estimates.

58 citations


Proceedings ArticleDOI
B. Brock1, K. Rajamani1
03 Nov 2003
TL;DR: This paper discusses several of the SOC design issues pertaining to dynamic voltage and frequency scalable systems, and how these issues were resolved in the IBM PowerPC 405LP processor and introduces DPM, a novel architecture for policy-guided dynamic power management.
Abstract: This paper discusses several of the SOC design issues pertaining to dynamic voltage and frequency scalable systems, and how these issues were resolved in the IBM PowerPC 405LP processor. We also introduce DPM, a novel architecture for policy-guided dynamic power management. We illustrate the utility of DPM by its ability to implement several classes of power management strategies and demonstrate practical results for a 405LP embedded system.

58 citations


Journal ArticleDOI
Hazim Shafi1, Patrick J. Bohrer1, J. Phelan1, C. A. Rusu1, James L. Peterson1 
TL;DR: The design and validation of a performance and power simulator that is part of the Mambo simulation environment for PowerPC® systems, designated as Tempo, and examples of how well it can predict the runtime power consumption of a 405GP microprocessor during application execution are shown.
Abstract: This paper describes the design and validation of a performance and power simulator that is part of the Mambo simulation environment for PowerPC® systems. One of the most notable features of the simulator, designated as Tempo, is the incorporation of an event-driven power model. Tempo satisfies an important need for fast and accurate performance and power simulation tools at the system level. The power and performance predictions from the simulated model of a PowerPC 405GP (or simply 405GP) were validated against a 405GP-based evaluation board instrumented for power measurements using 42 application/dataset combinations from the EEMBC benchmark suite. The average performance and energy-prediction errors were 0.6% and -4.1%, respectively. In addition to describing Tempo, we show examples of how well it can predict the runtime power consumption of a 405GP microprocessor during application execution.

50 citations


Journal ArticleDOI
TL;DR: A generic framework for defining instructions, programs, and the semantics of their instantiation by operations in a multiprocessor environment that allows an architect to reveal the programming view induced by a shared-memory architecture and guides architecture-level verification.
Abstract: This paper introduces a generic framework for defining instructions, programs, and the semantics of their instantiation by operations in a multiprocessor environment. The framework captures information flow between operations in a multiprocessor program by means of a reads-from mapping from read operations to write operations. Two fundamental relations are defined on the operations: a program order between operations which instantiate the program of some processor and view orders which are specific to each shared memory model. An operation cannot read from the "hidden" past or from the future; the future and the past causality can be examined either relative to the program order or relative to the view orders. A shared memory model specifies, for a given program, the permissible transformation of resource states. The memory model should reflect the programmer's view by citing the guaranteed behavior of the multiprocessor in the interface visible to the programmer. The model should refrain from dictating the design practices that should be followed by the implementation. Our framework allows an architect to reveal the programming view induced by a shared-memory architecture; it serves programmers exploring the limits of the programming interface and guides architecture-level verification. The framework is applicable for complex, commercial architectures as it can capture subtle programming-interface details, exposing the underlying aggressive microarchitecture mechanisms. As an illustration, we define the shared memory model supported by the PowerPC architecture, within our framework.

48 citations


Proceedings ArticleDOI
26 Oct 2003
TL;DR: The system overview of the Java Just-In-Time (JIT) compiler is described, which is the basis for the latest production version of IBM Java JIT compiler that supports a diversity of processor architectures including both 32-bit and 64-bit modes, CISC, RISC, and VLIW architectures.
Abstract: This paper describes the system overview of our Java Just-In-Time (JIT) compiler, which is the basis for the latest production version of IBM Java JIT compiler that supports a diversity of processor architectures including both 32-bit and 64-bit modes, CISC, RISC, and VLIW architectures. In particular, we focus on the design and evaluation of the cross-platform optimizations that are common across different architectures. We studied the effectiveness of each optimization by selectively disabling it in our JIT compiler on three different platforms: IA-32, IA-64, and PowerPC. Our detailed measurements allowed us to rank the optimizations in terms of the greatest performance improvements with the smallest compilation times. The identified set includes method inlining only for tiny methods, exception check eliminations using forward dataflow analysis and partial redundancy elimination, scalar replacement for instance and class fields using dataflow analysis, optimizations for type inclusion checks, and the elimination of merge points in the control flow graphs. These optimizations can achieve 90% of the peak performance for two industry-standard benchmark programs on these platforms with only 34% of the compilation time compared to the case for using all of the optimizations.

47 citations


01 Jan 2003
TL;DR: The transistors that contribute to the leakage power in each SRAM sub-circuit are identified as a function of the operation (read/write/idle) on the SRAM and parameterized leakage power models are developed in terms of the high level design parameters and transistor widths.
Abstract: In this paper we propose analytical models for estimating the leakage power in CMOS based SRAM designs. We identify the transistors that contribute to the leakage power in each SRAM sub-circuit as a function of the operation (read/write/idle) on the SRAM and develop parameterized leakage power models in terms of the high level design parameters and transistor widths. The models take number of rows, number of columns, read column multiplexer size and write column multiplexer size of the SRAM along with the technology parameters as input to estimate the leakage power. The developed models are validated by comparing their estimates against the power measured using SPICE simulations on industrial SRAM designs belonging to the e500 processor core (e500 is the Motorola processor core that is compliant with the PowerPC Book E architecture). The comparison shows that the models are highly accurate with an error margin of less than 23.9%. This work was done in collaboration with Motorola Corporation.

Proceedings ArticleDOI
03 Mar 2003
TL;DR: A streaming method called block streaming is proposed, with a tool that partitions real-time software into parts which can be transmitted (streamed) to the embedded device; a robotics application that cannot meet its real-time deadline without the streaming method is able to meet it with the method.
Abstract: Software streaming allows the execution of stream-enabled software on a device even while the transmission/streaming may still be in progress. Thus, the software can be executed while it is being streamed instead of causing the user to wait for the completion of download, decompression, installation and reconfiguration. Our streaming method can reduce application load time seen by the user since the application can start running as soon as the first executable unit is loaded into the memory. Furthermore, unneeded parts of the application might not be downloaded to the device. As a result, resource utilization such as memory and bandwidth usage may also be more efficient. Using our streaming method, an embedded device can support a wide range of real-time applications. The applications can be run on demand. In this paper, a streaming method we call block streaming is proposed. Block streaming is determined at the assembly code level. We implemented a tool to partition real-time software into parts which can be transmitted (streamed) to the embedded device. Our streaming method was implemented and simulated on a hardware-software co-simulation platform in which we used the PowerPC architecture. We show a robotics application that without our streaming method is unable to meet its real-time deadline. However, with our software streaming method, the application is able to meet its deadline. The application load time for this application also improves by a factor of more than 10× when compared to downloading the entire application before running it.

Journal ArticleDOI
TL;DR: The organization of the SoC design is described, the capabilities provided in the design to match the performance and power consumption with the need of the application are described, and measured results for the PowerPC 405LP processor are presented.
Abstract: The PowerPC® 405LP system-on-a-chip (SoC) processor, which was developed for high-content, battery-powered application space, provides dynamic voltage-scaling and on-the-fly frequency-scaling capabilities that allow the system and applications to adapt to changes in their performance demands and power constraints during operation. The 405LP operates over a voltage supply range of 1.95 to 0.9 V with a range of power efficiencies of 1.0 to 3.9 MIPS/mW when executing the Dhrystone benchmark. Operating system and application software support allow the applications to take full advantage of the energy-efficiency capabilities of the SoC. This paper describes the organization of the SoC design, details the capabilities provided in the design to match the performance and power consumption with the need of the application, describes how these capabilities are employed, and presents measured results for the PowerPC 405LP processor.

Proceedings ArticleDOI
09 Nov 2003
TL;DR: A methodology and a tool that model power dissipation in SRAM-based arrays accurately based on a high-level description of the array, and has been validated on industrial designs across a wide variety of array implementations in the e500 processor core.
Abstract: While array structures are a significant source of power dissipation, there is a lack of accurate high-level power estimators that account for varying array circuit implementation styles. We present a methodology and a tool, the implementation-dependent array power (IDAP) estimator, that model power dissipation in SRAM-based arrays accurately based on a high-level description of the array. The models are parameterized by the array operations and various technology dependent parameters. The methodology is generic and the IDAP tool has been validated on industrial designs across a wide variety of array implementations in the e500 processor core (e500 is the Motorola processor core that is compliant with the PowerPC Book E architecture). For these industrial designs, IDAP generates high-level estimates for dynamic power dissipation that are accurate with an error margin of less than 22.2% of detailed (layout extracted) SPICE simulations. We apply the tool in three different scenarios: 1) identifying the subblocks that contribute to power significantly; 2) evaluating the effect of bitline-voltage swing on array power; and 3) evaluating the effect of memory bit-cell dimensions on array power.

Journal Article
TL;DR: The design aspects of instruction arbitration in an ρμ-coded CCM are discussed and a complete design of such an arbiter is proposed and its VHDL code is synthesized for the VirtexII Pro platform FPGA of Xilinx.
Abstract: In this paper, the design aspects of instruction arbitration in an ρμ-coded CCM are discussed. Software considerations, architectural solutions, implementation issues and functional testing of an ρμ-code arbiter are presented. A complete design of such an arbiter is proposed and its VHDL code is synthesized for the VirtexII Pro platform FPGA of Xilinx. The functionality of the unit is verified by simulations. A very low utilization of available reconfigurable resources is achieved after the design is synthesized. Simulations of an MPEG-4 case study suggest considerable performance speed-up in the range of 2,4-8,8 versus a pure software PowerPC implementation.

01 Jan 2003
TL;DR: The SPACE algorithm supports precise exceptions, but in an improvement over the previous work, eliminates the need for most hardware register commit operations, which are used to place values in their original program location in the original program sequence.
Abstract: We describe the SPACE algorithm for translating from one architecture such as PowerPC into operations for another architecture such as VLIW, while also supporting scheduling, register allocation, and other optimizations. Our SPACE algorithm supports precise exceptions, but in an improvement over our previous work, eliminates the need for most hardware register commit operations, which are used to place values in their original program location in the original program sequence. The elimination of commit operations frees issue slots for other computation, a feature that is especially important for narrower machines. The SPACE algorithm is efficient, running in O(N²) time in the number N of operations in the worst case, but in practice is closer to a two-pass O(N) algorithm. The fact that our approach provides precise exceptions with low overhead is useful to programming language designers as well: exception models in which an exception can occur at almost any instruction are not prohibitively expensive.

Book ChapterDOI
01 Sep 2003
TL;DR: The design aspects of instruction arbitration in an ρμ-coded CCM are discussed and a complete design of such an arbiter is proposed and its VHDL code is synthesized for the VirtexII Pro platform FPGA of Xilinx.
Abstract: In this paper, the design aspects of instruction arbitration in an ρμ-coded CCM are discussed. Software considerations, architectural solutions, implementation issues and functional testing of an ρμ-code arbiter are presented. A complete design of such an arbiter is proposed and its VHDL code is synthesized for the VirtexII Pro platform FPGA of Xilinx. The functionality of the unit is verified by simulations. A very low utilization of available reconfigurable resources is achieved after the design is synthesized. Simulations of an MPEG-4 case study suggest considerable performance speed-up in the range of 2,4-8,8 versus a pure software PowerPC implementation.

Journal ArticleDOI
01 May 2003
TL;DR: The architecture of the BlueGene/L massively parallel supercomputer is described: each computing node consists of a single compute ASIC plus 256 MB of external memory, and 65,536 such nodes are connected into a 3-d torus with a geometry of 32×32×64.
Abstract: The architecture of the BlueGene/L massively parallel supercomputer is described. Each computing node consists of a single compute ASIC plus 256 MB of external memory. The compute ASIC integrates two 700 MHz PowerPC 440 integer CPU cores, two 2.8 Gflops floating point units, 4 MB of embedded DRAM as cache, a memory controller for external memory, six 1.4 Gbit/s bi-directional ports for a 3-dimensional torus network connection, three 2.8 Gbit/s bi-directional ports for connecting to a global tree network and a Gigabit Ethernet for I/O. 65,536 such nodes are connected into a 3-d torus with a geometry of 32×32×64. The total peak performance of the system is 360 Teraflops and the total amount of memory is 16 TeraBytes.
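
The system-level totals in this abstract follow from the per-node figures, as the short calculation below illustrates. It uses only numbers quoted in the abstract; the variable names are illustrative, and the choice of decimal units for flops and binary units for memory is an assumption.

```python
# Per-node figures quoted in the abstract.
torus_dims = (32, 32, 64)
fpus_per_node = 2
gflops_per_fpu = 2.8
memory_mb_per_node = 256

# Node count implied by the torus geometry.
torus_nodes = torus_dims[0] * torus_dims[1] * torus_dims[2]

# Peak performance, assuming decimal units (1 Teraflops = 1000 Gflops).
peak_tflops = torus_nodes * fpus_per_node * gflops_per_fpu / 1000.0

# Total memory, assuming binary units (1 TB = 1024 * 1024 MB).
memory_tib = torus_nodes * memory_mb_per_node / (1024 * 1024)

print(f"nodes: {torus_nodes}")
print(f"peak performance: {peak_tflops:.0f} Teraflops")
print(f"total memory: {memory_tib:.0f} TB")
```

The geometry gives exactly 65,536 nodes and 16 TB of memory. The peak comes out at about 367 Teraflops under the decimal-unit assumption, close to the rounded 360 Teraflops stated in the abstract.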

Book ChapterDOI
01 Jan 2003
TL;DR: The chapter describes the IBM PowerNP NP4GS31 network processor, which essentially provides wire-speed packet processing and forwarding capability through a set of programmable embedded processors and co-processors with a multiplicity of high-bandwidth embedded and external memories.
Abstract: Network processors support the complex packet processing functions at media speed. The chapter describes the IBM PowerNP NP4GS31 network processor. PowerNP essentially provides wire-speed packet processing and forwarding capability through a set of programmable embedded processors and co-processors with a multiplicity of high-bandwidth embedded and external memories. Co-processors operate in parallel with processors and perform functions that are computationally intensive to perform. PowerNP also provides the basis for a wide range of solutions from a low-end stand-alone system to a large multirack system through an external switching fabric. It is noted that PowerNP can be used in the traditional switch or router equipment as well as in the emerging applications for the server and storage equipment. It supports scalable traffic engineering functions or security features. PowerNPs provide data plane packet processing functions. A separate, external general-purpose processor (GPP) such as an IBM PowerPC provides control plane and system management functions and acts as the CP for the whole system.

01 Jan 2003
TL;DR: PowerPC VME boards running Linux have been adopted for the ELETTRA control system front-end computers, in view of the new booster injector and a smooth upgrade of the existing control system; the main issues associated with these choices are described and an example application is presented.
Abstract: The ELETTRA control system front-end computers are presently based on 68k VME boards and the OS-9 operating system. In view of the construction of the new booster injector and of a smooth upgrade of the existing control system, PowerPC VME boards running Linux have been adopted. The new platform provides reliability, performance and flexibility, while the RTAI (Real Time Application Interface) extension offers, where necessary, satisfying real-time capabilities that compete well with those of the most popular real-time operating systems. This article describes the main issues associated with the choices we made and presents an example of application.

Proceedings ArticleDOI
29 May 2003
TL;DR: Certain methodologies that leverage the idea of systematic abstractions on microarchitectures defined at the RTL level are developed that show results of 50% improvement in simulation efficiency.
Abstract: The steady persistence of Moore's law over decades has led to an astronomical decrease in the feature size supported by semiconductor manufacturing technology, which has increased the transistor count on a single chip. This has led to complex and advanced microprocessor microarchitectures and functionalities. Consequently, the need for efficient methodologies for microprocessor design has given rise to new challenges for hardware systems designers to address the ever growing concern over bridging the design-productivity gap and reduction of the time-to-market. An estimated 70% of the development cycle of a microprocessor is spent in validation, and in the absence of a standard and proven formal verification technology, simulation based validation occupies the major share of the validation cycle. Functional correctness of a microprocessor mostly refers to conformance of the microarchitecture implementation to the instruction set architecture (ISA). Through unit level, block level, full-chip level, and system level simulation the designer can check for such functional correctness. However, simulations for full-chip verification consume ample amounts of time, directly affecting the design and eventually the time-to-market. Evidently, simulation efficiency is a major concern for most designers, and the development of methodologies to reduce simulation time is crucial. To address this concern we develop certain methodologies that leverage the idea of systematic abstractions on microarchitectures defined at the RTL level. We experiment with a PowerPC model written in SystemC and show results of 50% improvement in simulation efficiency.

01 Jan 2003
TL;DR: A master's thesis on the design and implementation of a 64-bit PowerPC port of the Jikes Research Virtual Machine.
Abstract: Jikes Research Virtual Machine: Design and Implementation of a 64-bit PowerPC Port. Abstract of thesis submitted in partial fulfillment of the requirements for the degree of Master of Science, Computer Science, The University of New Mexico, Albuquerque, New Mexico, December 2003.

Proceedings ArticleDOI
22 Apr 2003
TL;DR: A software framework for the parallel execution of sequential programs using C++ classes is presented, which promises a composable multi paradigm, unified approach to parallelism across different technologies: PowerPC, DSP and FPGA.
Abstract: A software framework for the parallel execution of sequential programs using C++ classes is presented. The functional language Concurrent ML is used to implement the underlying harness and to design the programming interfaces. The hardware-independent harness promises a composable multi paradigm, unified approach to parallelism across different technologies: PowerPC, DSP and FPGA. Performance results for an image processing case study are given.

01 Jan 2003
TL;DR: The UltraController™ embedded processor solution is available as a complete reference design, with documentation, to be utilized as a lightweight PowerPC microcontroller, and clearly demonstrates substantial fabric savings by moving slow logic into the UltraController processor.
Abstract: The UltraController™ embedded processor solution is available as a complete reference design, with documentation, to be utilized as a lightweight PowerPC™ microcontroller. The 32-bit input / 32-bit output design created as a simple block, ready to integrate into larger designs, requires only a reset and a clock input. The UltraController solution utilizes the available PowerPC processor(s) in the Virtex-II Pro™ device and several block RAMs. The UltraController design is available for a variety of applications including logic and data control, device configuration, system monitoring, and simple data manipulation. A reference design, created both in fabric and on the UltraController processor, clearly demonstrates substantial fabric savings by moving slow logic into the UltraController processor. This allows users to reduce cost by utilizing smaller devices. A block diagram of the UltraController solution is shown in Figure 1.

01 Jan 2003
TL;DR: For almost ten years, the Ganil control system has been based on VMS workstations and Camac/Vme crates running Vaxeln on RtVax controllers, with Ada as the common language and Ingres as the relational database.
Abstract: For almost ten years, the Ganil control system has been based on VMS workstations and Camac/Vme crates running Vaxeln on RtVax controllers, with Ada as the common language and Ingres as the relational database. When Digital Equipment (now HP-Compaq) gave up on RtVax processors, we decided to move to Vme crates with PowerPC controllers running VxWorks. After that, we also wanted to try Linux, to get rid of the remaining links with Vms in the beam tuning programs and to be able to use some free software tools that were not very powerful in a Vms/Motif environment. This paper describes the milestones we reached: graphical user interfaces using Motif with XRT widgets; database access with Ada/SQL requests; TCP/IP communication with VxWorks real-time crates.

Journal ArticleDOI
TL;DR: In this paper, an application of the object-oriented design methodology is proposed to develop the operational flight program (OFP) for a stores management computer (SMC), which manages and controls stores inventory, stores activation, missile launch, and release of conventional weapons.
Abstract: We propose an application of the object-oriented design methodology to develop the operational flight program (OFP) for a stores management computer (SMC), which manages and controls stores inventory, stores activation, missile launch, and release of conventional weapons. For the development of the SMC, a military version of the PowerPC 603e is used as the central processing unit board, with VxWorks as the real-time operating system. The Tornado software development environment (SDE) and the programming language Ada95 are used for OFP development. We design three layers in the OFP to keep the software modules independent. An avionics system computer (ASC) simulator and a test bench are developed for the SMC integration and verification tests. The tests are rigorously and successfully conducted.

23 Sep 2003
TL;DR: This new technology, being developed by WorldScape and ClearSpeed, has been shown to provide ten to one hundred times the overall performance of PowerPC or Pentium-based architectures, especially when performing image and signal processing functions, such as FFTs or filters.
Abstract: This briefing describes the development of a novel, ultra-high performance next-generation Single-Instruction/Multiple-Data (SIMD) processing architecture originally designed to realize immersive, photo-realistic 3-D simulations. This low-power, Multi-Threaded Array Processor (MTAP) architecture provides for hundreds and ultimately thousands of processing elements, each with optional floating point hardware, to perform data parallel processing on image and signal processing applications as well as for compression, encryption, search, and general sensor processing applications. The technology is supported by a flexible development environment, including assembly language and C-based language support, as well as a cycle accurate simulator, with plans to develop industry standard API Libraries based upon VSIPL and, ultimately, HPEC-SI. This new technology, being developed by WorldScape and ClearSpeed, has been shown to provide ten to one hundred times the overall performance of PowerPC or Pentium-based architectures, especially when performing image and signal processing functions, such as FFTs or filters. In general, the architecture has been shown to provide significant throughput, size, and power advantages for embedded processing applications.

Proceedings ArticleDOI
Abadir1, Zeng1, Pyron1, Zhu
08 Dec 2003
TL;DR: An automated flow for creating gate level test models from circuits at the switch level is presented, in use for the past several years within Motorola for the high performance processor family implementing the PowerPC instruction set architecture.
Abstract: Custom VLSI design at the switch level is commonly needed when a chip is required to meet stringent operating requirements in terms of speed, power, or area. ATPG requires gate level models, which are verified for correctness against switch level models. Typically, test models for custom logic are created manually from the switch level models - a tedious, error-prone process requiring experienced DFT engineers. This paper presents an automated flow for creating gate level test models from circuits at the switch level. Besides providing comparable test quality, the test model created by the automated flow maintains structural similarity to the original switch-level circuit, which greatly facilitates failure analysis. The automated flow has been in use for the past several years within Motorola for the high performance processor family implementing the PowerPC instruction set architecture. We present experimental results on MPC7455.

Proceedings Article
01 Jan 2003
TL;DR: In this article, an automated flow for creating gate level test models from circuits at the switch level is presented; the resulting test model maintains structural similarity to the original switch-level circuit, which greatly facilitates failure analysis.
Abstract: Custom VLSI design at the switch level is commonly needed when a chip is required to meet stringent operating requirements in terms of speed, power, or area. ATPG requires gate level models, which are verified for correctness against switch level models. Typically, test models for custom logic are created manually from the switch level models - a tedious, error-prone process requiring experienced DFT engineers. This paper presents an automated flow for creating gate level test models from circuits at the switch level. Besides providing comparable test quality, the test model created by the automated flow maintains structural similarity to the original switch-level circuit, which greatly facilitates failure analysis. The automated flow has been in use for the past several years within Motorola for the high performance processor family implementing the PowerPC instruction set architecture. We present experimental results on MPC7455.