Showing papers on "PowerPC" published in 1998

Proceedings Article•DOI•

Design and analysis of power distribution networks in PowerPC™ microprocessors

Abhijit Dharchoudhury¹, Rajendran Panda¹, David Blaauw¹, Ravi Vaidyanathan¹, Bogdan Tutuianu, David Bearden•Institutions (1)

Motorola¹

01 May 1998

TL;DR: A methodology for the design and analysis of power grids in the PowerPC™ microprocessors covering the need for power grid analysis across all stages of the design process is presented.

Abstract: We present a methodology for the design and analysis of power grids in the PowerPC™ microprocessors. The methodology covers the need for power grid analysis across all stages of the design process. A case study showing the application of this methodology to the PowerPC™ 750 microprocessor is presented.

190 citations

Journal Article•DOI•

Virtual memory in contemporary microprocessors

Bruce Jacob¹, Trevor Mudge•Institutions (1)

University of Maryland, College Park¹

01 Jul 1998-IEEE Micro

TL;DR: The memory management designs of a sampling of six recent processors are considered, focusing primarily on their architectural differences, and hint at optimizations that someone designing or porting system software might want to consider.

Abstract: Here, we consider the memory management designs of a sampling of six recent processors, focusing primarily on their architectural differences, and hint at optimizations that someone designing or porting system software might want to consider. We selected examples from the most popular commercial microarchitectures: the MIPS R10000, Alpha 21164, PowerPC 604, PA-8000, UltraSPARC-I, and Pentium II. This survey describes how each processor architecture supports the common features of virtual memory: address space protection, shared memory, and large address spaces.

115 citations

Journal Article•DOI•

Heterogeneous process migration: the Tui system

Peter A. Smith¹, Norman C. Hutchinson¹•Institutions (1)

University of British Columbia¹

01 May 1998-Software - Practice and Experience

TL;DR: Tui is a migration system that is able to translate the memory image of a program between four common architectures (m68000, SPARC, i486 and PowerPC) and requires detailed knowledge of all data types and variables used with the program.

Abstract: Heterogeneous process migration is a technique whereby an active process is moved from one machine to another. It must then continue normal execution and communication. The source and destination processors can have a different architecture, that is, different instruction sets and data formats. Because of this heterogeneity, the entire process memory image must be translated during the migration. Tui is a migration system that is able to translate the memory image of a program (written in ANSI-C) between four common architectures (m68000, SPARC, i486 and PowerPC). This requires detailed knowledge of all data types and variables used with the program. This is not always possible in non-type-safe (but popular) languages such as ANSI-C, Pascal and Fortran. The important features of the Tui algorithm are discussed in great detail. This includes the method by which a program's entire set of data values can be located, and eventually reconstructed on the target processor. Performance figures demonstrating the viability of using Tui to migrate real applications are given. © 1998 John Wiley & Sons, Ltd.

98 citations

Journal Article•DOI•

A decompression core for PowerPC

Timothy Michael Kemp¹, R. K. Montoye¹, J. D. Harper¹, John Davis Palmer¹, Daniel J. Auerbach¹•Institutions (1)

IBM¹

01 Nov 1998-IBM Journal of Research and Development

TL;DR: This paper describes a method for improving code size efficiency involving the use of compression techniques to reduce the size of the stored code, and on-the-fly hardware decompression at full processor speed for execution.

Abstract: Code size efficiency is a critical parameter in the design of computer systems for embedded applications. This paper describes a method for improving code size efficiency involving the use of compression techniques to reduce the size of the stored code, and on-the-fly hardware decompression at full processor speed for execution. A simple frequency-based encoding scheme for PowerPC® code achieves a typical code size reduction to 60% of the original size. A corresponding decompression core has been implemented for an embedded microprocessor, such as the PowerPC 401™. The compression/decompression scheme operates in a manner transparent to the processor and requires no changes to such tools as compilers, linkers, and loaders.
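The general idea of frequency-based encoding can be illustrated with a toy model. The sketch below is not the scheme from the paper: it assumes a hypothetical format in which the most frequent 32-bit instruction words are replaced by a one-byte table index, and every other word is stored as an escape byte followed by the four raw bytes. The names `build_table`, `compress`, `decompress`, and `ESCAPE` are illustrative only.

```python
from collections import Counter
import struct

ESCAPE = 0xFF  # hypothetical marker: the next 4 bytes are a raw instruction word


def build_table(words, size=255):
    """Return the `size` most frequent words (size <= 255 keeps 0xFF free)."""
    return [w for w, _ in Counter(words).most_common(size)]


def compress(words, table):
    """Encode each word as a 1-byte table index, or ESCAPE + 4 raw bytes."""
    index = {w: i for i, w in enumerate(table)}
    out = bytearray()
    for w in words:
        if w in index:
            out.append(index[w])
        else:
            out.append(ESCAPE)
            out += struct.pack(">I", w)  # big-endian 32-bit word
    return bytes(out)


def decompress(data, table):
    """Invert compress(): expand table indices and escaped raw words."""
    words, i = [], 0
    while i < len(data):
        if data[i] == ESCAPE:
            words.append(struct.unpack(">I", data[i + 1:i + 5])[0])
            i += 5
        else:
            words.append(table[data[i]])
            i += 1
    return words


# Toy instruction stream: a few 32-bit values repeat often.
program = [0x38600000] * 6 + [0x4E800020] * 3 + [0x7C0802A6]
table = build_table(program, size=2)
packed = compress(program, table)
assert decompress(packed, table) == program
ratio = len(packed) / (4 * len(program))  # compressed size / original size
```

In this toy stream the ten words (40 bytes) pack into 14 bytes, not counting the table itself. A real scheme would also have to account for table storage and for locating compressed code at branch targets, which this sketch ignores.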

96 citations

Journal Article•DOI•

DSP processors hit the mainstream

J. Eyre, J. Bier

01 Aug 1998-IEEE Computer

TL;DR: Although fundamentally related, DSP processors are significantly different from general purpose processors (GPPs) like the Intel Pentium or PowerPC, and the authors explain what DSP processors are and what they do.

Abstract: These days, the once obscure engineering term "DSP" (digital signal processing) is working its way into common use. It has begun to crop up on the labels of an ever wider range of products, from home audio components to answering machines. This is not merely a reflection of a new marketing strategy, however; there truly is more digital signal processing inside today's products than ever before. But why is the market for DSP processors booming? The answer is somewhat circular: as microprocessor fabrication processes have become more sophisticated, the cost of a microprocessor capable of performing DSP tasks has dropped significantly to the point where such a processor can be used in consumer products and other cost sensitive systems. As a result, more and more products have begun using DSP processors, fueling demand for faster, smaller, cheaper, more energy-efficient chips. Although fundamentally related, DSP processors are significantly different from general purpose processors (GPPs) like the Intel Pentium or PowerPC. The authors explain what DSP processors are and what they do. They also offer a guide to evaluating DSP processors for use in a product or application.

93 citations

Journal Article•DOI•

A 1.0-GHz single-issue 64-bit PowerPC integer processor

Joel Abraham Silberman¹, Naoaki Aoki, David W. Boerstler, Jeffrey L. Burns, Sang Hoo Dhong, A. Essbaum, Uttam Shyamalindu Ghoshal, David F. Heidel, Peter Hofstee, Kyung Tek Lee, David Meltzer, Hung Ngo, Kevin J. Nowka, S. Posluszny, O. Takahashi, Ivan Vo, B. Zoric•Institutions (1)

IBM¹

05 Feb 1998

TL;DR: This 64 b single-issue integer processor, comprised of about one million transistors, is fabricated in a 0.15 μm effective channel length, six-metal-layer CMOS technology and intended as a vehicle to explore circuit, clocking, microarchitecture, and methodology options for high-frequency processors.

Abstract: This 64 b single-issue integer processor, comprised of about one million transistors, is fabricated in a 0.15 μm effective channel length, six-metal-layer CMOS technology. Intended as a vehicle to explore circuit, clocking, microarchitecture, and methodology options for high-frequency processors, the processor prototype implements 60 fixed-point compare, logical, arithmetic, and rotate-merge-mask instructions of the PowerPC instruction-set architecture with single-cycle latency. The processor executes programs written in this instruction subset from cache with a 1 ns cycle. In addition, the prototype implements 36 PowerPC load/store instructions that execute as single-cycle operations (zero wait cycles) with 1.15 ns latency. Full data forwarding and full at-speed scan testing are supported.

75 citations

Journal Article•DOI•

A 480 MHz RISC microprocessor in a 0.12 μm L_eff CMOS technology with copper interconnects

Chekib Akrout¹, J. Bialas, M. Canada, D. Cawthron, J. Corr, Bijan Davari, R. Floyd, Stephen Frank Geissler, Ronald D. Goldblatt, Robert M. Houle, Paul D. Kartschoke, David Kramer, P. McCormick, Norman J. Rohrer¹, Gerard M. Salem, R. Schulz, L. Su, L. Whitney•Institutions (1)

IBM¹

05 Feb 1998

TL;DR: In this article, a 32 b 480 MHz PowerPC reduced-instruction-set-computer (RISC) microprocessor is migrated into an advanced 0.2 μm CMOS technology with copper interconnects and multi-threshold transistors.

Abstract: A 32 b 480 MHz PowerPC reduced-instruction-set-computer (RISC) microprocessor is migrated into an advanced 0.2 μm CMOS technology with copper interconnects and multi-threshold transistors. These technology features have helped to increase the microprocessor internal clock frequency to 480 MHz at 2.0 V and 85 °C, and at the fast end of the process distribution. When operating at room temperature, the clock frequency increases to over 500 MHz. The microprocessor architecture includes two 32 KB L1 caches, one for data and one for instructions, integrated L2 cache controller working with L2 caches of 256 KB, 512 KB, or 1 MB, and I/Os interfacing with the external bus using industry-standard 3.3 V. The microprocessor is implemented in 2.5 V CMOS technology and has migrated to 1.8 V CMOS technology.

68 citations

Book Chapter•DOI•

New Serial and Parallel Recursive QR Factorization Algorithms for SMP Systems

Erik Elmroth¹, Fred G. Gustavson²•Institutions (2)

Umeå University¹, IBM²

14 Jun 1998

TL;DR: A new recursive QR factorization algorithm whose automatic variable blocking replaces a level 2 part of a standard block algorithm with level 3 operations; a hybrid recursive variant outperforms the LAPACK algorithm DGEQRF by 78% to 21% as m=n increases from 100 to 1000.

Abstract: We present a new recursive algorithm for the QR factorization of an m by n matrix A. The recursion leads to an automatic variable blocking that allows us to replace a level 2 part in a standard block algorithm by level 3 operations. However, there are some additional costs for performing the updates which prohibit the efficient use of the recursion for large n. This obstacle is overcome by using a hybrid recursive algorithm that outperforms the LAPACK algorithm DGEQRF by 78% to 21% as m=n increases from 100 to 1000. A successful parallel implementation on a PowerPC 604 based IBM SMP node based on dynamic load balancing is presented. For 2, 3, 4 processors and m=n=2000 it shows speedups of 1.96, 2.99, and 3.92 compared to our uniprocessor algorithm.
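The reported speedups can be turned into parallel efficiencies with one line of arithmetic. The short sketch below uses only the figures quoted in the abstract; the helper name `parallel_efficiency` is an illustrative assumption, and efficiency is taken in the usual sense of speedup divided by processor count.

```python
# Speedups reported in the abstract for m = n = 2000.
reported_speedup = {2: 1.96, 3: 2.99, 4: 3.92}


def parallel_efficiency(speedup, processors):
    """Efficiency = speedup / processor count (1.0 means ideal scaling)."""
    return speedup / processors


efficiency = {p: parallel_efficiency(s, p) for p, s in reported_speedup.items()}
for p, e in efficiency.items():
    print(f"{p} processors: efficiency {e:.1%}")
```

By this measure the implementation runs at roughly 98% efficiency on 2 and 4 processors and about 99.7% on 3.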

59 citations

Proceedings Article•DOI•

An eight-issue tree-VLIW processor for dynamic binary translation

Kemal Ebcioglu¹, Jason E. Fritts, Stephen V. Kosonocky, Michael K. Gschwind, Erik R. Altman, Krishnan K. Kailas, T. Bright•Institutions (1)

IBM¹

05 Oct 1998

TL;DR: Performance simulations show that the simplicity of a VLIW architecture allows a wide-issue processor to operate at high frequencies.

Abstract: Presented is an 8-issue tree-VLIW processor designed for efficient support of dynamic binary translation. This processor confronts two primary problems faced by VLIW architectures: binary compatibility and branch performance. Binary compatibility with existing architectures is achieved through dynamic binary translation which translates and schedules PowerPC instructions to take advantage of the available instruction level parallelism. Efficient branch performance is achieved through tree instructions that support multi-way path and branch selection within a single VLIW instruction. The processor architecture is described, along with design details of the branch unit, pipeline, register file and memory hierarchy for a 0.25 micron standard-cell design. Performance simulations show that the simplicity of a VLIW architecture allows a wide-issue processor to operate at high frequencies.

58 citations

Journal Article•DOI•

Designing for a gigahertz [guTS integer processor]

Harm Peter Hofstee¹, S.H. Dhong, David Meltzer, Kevin J. Nowka, Joel Abraham Silberman, J.I. Burns, S. Posluszny, O. Takahashi•Institutions (1)

IBM¹

01 May 1998-IEEE Micro

TL;DR: The goal of the guTS project was to demonstrate that circuit techniques, and circuit-centric design, could significantly increase the performance of microprocessors, thus providing headroom for future performance growth beyond contributions from microarchitecture and CMOS technology.

Abstract: At the IEEE International Solid State Circuits Conference this February, the IBM Austin Research Laboratory presented an experimental 64-bit integer processor called guTS (gigahertz unit Test Site). The goal of the guTS project was to demonstrate that circuit techniques, and circuit-centric design, could significantly increase the performance of microprocessors, thus providing headroom for future performance growth beyond contributions from microarchitecture and CMOS technology. To clearly distinguish the design contributions of this project from innovations in CMOS technology we chose a fabrication technology that was in production in 1997. The guTS processor is a full-custom, nearly 100% dynamic design. Its single-issue core implements 96 instructions from the integer subset of the PowerPC instruction set architecture, and covers in excess of 90% of instructions executed in typical code. Address translation, floating-point, and I/O-related instructions are omitted. All instructions, including loads and stores, execute in one cycle. We measured core speeds in excess of a gigahertz. We focus here on the circuit-centric design approach that enabled the gigahertz result. This approach requires designers to operate across the boundaries of microarchitecture, logic, circuit, and physical design. We explain why developments in CMOS technology increasingly favor this approach.

56 citations

Journal Article•DOI•

Comparison of single- and dual-pass multiply-add fused floating-point units

R.M. Jessani, M. Putrino¹•Institutions (1)

IBM¹

01 Sep 1998-IEEE Transactions on Computers

TL;DR: The paper discusses the design complexities around the dual pass multiply array and its effect on area and performance in a given technology (PowerPC 604e™ and PowerPC 603e™ microprocessors).

Abstract: Low power, low cost, and high performance factors dictate the design of many microprocessors targeted to the low power computing market. The floating point unit occupies a significant percentage of the silicon area in a microprocessor due to its wide data bandwidth (for double precision computations) and the area occupied by the multiply array. For microprocessors designed for portable products, the design size of the floating point unit plays an important role in the low cost factor driven by reduced chip area. Some microprocessors have multiply-add fused floating point units with a reduced multiply array, requiring two passes through the array for operations involving double precision multiplies. The paper discusses the design complexities around the dual pass multiply array and its effect on area and performance. Floating point unit areas and their associated multiply array areas are compared for a single and dual pass implementation in a given technology (PowerPC 604e™ and PowerPC 603e™ microprocessors, respectively).

Proceedings Article•DOI•

Design methodology for a 1.0 GHz microprocessor

S. Posluszny¹, Naoaki Aoki¹, David William Boerstler¹, Jeffrey L. Burns¹, Sang Hoo Dhong¹, Uttam Shyamalindu Ghoshal¹, P. Hofstee¹, D. LaPotin¹, Kyung Tek Lee¹, David Meltzer¹, H.C. Ngo¹, Kevin J. Nowka¹, Joel Abraham Silberman¹, Osamu Takahashi¹, Ivan Vo¹•Institutions (1)

IBM¹

05 Oct 1998

TL;DR: Describes the design methodology used to build an experimental 1.0 GigaHertz PowerPC integer microprocessor at IBM's Austin Research Laboratory, covering design and verification tools as well as circuit constraints and microarchitecture philosophy.

Abstract: This paper describes the design methodology used to build an experimental 1.0 GigaHertz PowerPC integer microprocessor at IBM's Austin Research Laboratory. The high frequency requirements dictated the chip composition to be almost entirely custom macros using dynamic circuit techniques. The methodology presented will cover design and verification tools as well as circuit constraints and microarchitecture philosophy. The microarchitecture, circuits and tools were defined by the high frequency requirements of the processor as well as the aggressive design schedule and size of the design team.

Journal Article•

Design constraints in symbolic model checking

Matt Kaufmann, Andrew Martin, Carl Pixley

01 Jan 1998-Lecture Notes in Computer Science

TL;DR: A technique for modeling environmental constraints that avoids the need for explicit construction of environments is presented and supports an assume/guarantee style of reasoning that also supports simulation monitors.

Abstract: A time-consuming and error-prone activity in symbolic model-checking is the construction of environments. We present a technique for modeling environmental constraints that avoids the need for explicit construction of environments. Moreover, our approach supports an assume/guarantee style of reasoning that also supports simulation monitors. We give examples of the use of constraints in PowerPC™ verification.

Proceedings Article•DOI•

StarT-Voyager: A Flexible Platform for Exploring Scalable SMP Issues

Boon Seong Ang¹, Derek Chiou¹, Daniel L. Rosenband¹, Mike Ehrlich¹, Larry Rudolph¹, Arvind¹•Institutions (1)

Massachusetts Institute of Technology¹

07 Nov 1998

TL;DR: The initial configuration of StarT-Voyager implements four forms of message passing along with S-COMA and NUMA shared memory support, and can be reconfigured to introduce new mechanisms improving usability and performance.

Abstract: This paper describes StarT-Voyager, a machine designed as an experimental platform for research in cluster system communication. The heart of StarT-Voyager is a network interface unit (NIU) that connects the memory bus of a PowerPC-based SMP to the MIT Arctic network. The NIU is highly flexible, with its set of functions easily modified by firmware or by programmable hardware, making it possible to compare different communication interfaces and implementation strategies on a common platform. Its flexibility comes from a fast embedded processor and large, fast FPGAs that surround a high-speed protected communication core. Its efficiency comes from a set of primitive operations that are implemented in hardware and are designed to reduce the firmware overhead. Our initial configuration of StarT-Voyager implements four forms of message passing along with S-COMA and NUMA shared memory support. With experimentation on the machine, it can be reconfigured to introduce new mechanisms improving usability and performance.

Journal Article•DOI•

Implementation of an environment for Monte Carlo simulation of fully 3-D positron tomography

Habib Zaidi¹, Claire Labbe¹, C. Morel¹•Institutions (1)

Geneva College¹

01 Sep 1998

TL;DR: The improved time performances resulting from parallelisation of the Monte Carlo calculations makes the Eidolon Monte Carlo program an attractive tool for modelling photon transport in 3-D positron tomography.

Abstract: This paper describes the implementation of the Eidolon Monte Carlo program designed to simulate fully three-dimensional (3-D) cylindrical positron tomographs on a MIMD parallel architecture. The original code was written in Objective-C and developed under the NeXTSTEP development environment. Different steps involved in porting the software on a parallel architecture based on PowerPC 604 processors running under AIX 4.1 are presented. Basic aspects and strategies of running Monte Carlo calculations on parallel computers are described. A linear decrease of the computing time was achieved with the number of computing nodes. The improved time performances resulting from parallelisation of the Monte Carlo calculations makes it an attractive tool for modelling photon transport in 3-D positron tomography. The parallelisation paradigm used in this work is independent from the chosen parallel architecture.

Proceedings Article•DOI•

Automatic generation of assertions for formal verification of PowerPC™ microprocessor arrays using symbolic trajectory evaluation

Li-C. Wang¹, Magdy S. Abadir¹, Nari Krishnamurthy¹•Institutions (1)

Motorola¹

01 May 1998

TL;DR: A novel method to automate the assertion creation process, which improves the efficiency and the quality of array verification, is described, and encouraging results on recent PowerPC arrays are presented.

Abstract: For verifying complex sequential blocks such as microprocessor embedded arrays, the formal method of symbolic trajectory evaluation (STE) has achieved great success in the past [3], [5], [6]. Past STE methodology for arrays requires manual creation of “assertions” to which both the RTL view and the actual design should be equivalent. In this paper, we describe a novel method to automate the assertion creation process which improves the efficiency and the quality of array verification. Encouraging results on recent PowerPC arrays will be presented.

Journal Article•DOI•

Fast Fitch-Parsimony Algorithms for Large Data Sets

Fredrik Ronquist¹•Institutions (1)

Uppsala University¹

01 Dec 1998-Cladistics

TL;DR: This paper discusses several time-saving modifications to published Fitch-parsimony tree search algorithms, including shortcuts that allow rapid evaluation of tree lengths and fast reoptimization of trees after clipping or joining of subtrees, as well as search strategies that allow one to successively increase the exhaustiveness of branch swapping.
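For context, the basic Fitch procedure that such shortcuts speed up can be sketched as follows. This is the textbook down-pass for a single unordered character on a rooted binary tree, not the paper's optimized algorithms; the nested-tuple tree representation and the function name `fitch_length` are assumptions made for illustration.

```python
def fitch_length(tree, states):
    """Return (state_set, length) for one character under Fitch parsimony.

    tree   -- a leaf name (str) or a 2-tuple of subtrees
    states -- dict mapping each leaf name to its observed state
    """
    if isinstance(tree, str):
        return {states[tree]}, 0
    left_set, left_len = fitch_length(tree[0], states)
    right_set, right_len = fitch_length(tree[1], states)
    common = left_set & right_set
    if common:
        # The children share at least one state: no extra change is needed.
        return common, left_len + right_len
    # Disjoint state sets: take the union and count one state change.
    return left_set | right_set, left_len + right_len + 1


# Hypothetical four-taxon example with one character.
tree = (("a", "b"), ("c", "d"))
states = {"a": "A", "b": "G", "c": "A", "d": "A"}
root_set, length = fitch_length(tree, states)
```

For this example the tree length is 1, with state A reconstructed at the root. A tree search repeats this kind of evaluation for every candidate tree, which is why shortcuts for length evaluation and reoptimization matter on large data sets.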

Proceedings Article•DOI•

Message passing support on StarT-Voyager

Boon Seong Ang¹, Derek Chiou, Larry Rudolph, Arvind•Institutions (1)

Massachusetts Institute of Technology¹

17 Dec 1998

TL;DR: MIT's StarT-Voyager, a hybrid message passing/shared memory parallel machine, provides four message passing mechanisms to achieve high performance over a wide spectrum of communication types and sizes.

Abstract: No single message passing mechanism can efficiently support all types of communication that commonly occur in most parallel or distributed programs. MIT's StarT-Voyager, a hybrid message passing/shared memory parallel machine, provides four message passing mechanisms to achieve high performance over a wide spectrum of communication types and sizes. Hardware and address translation enforced protection allows direct user-level access to message passing facilities in a multiuser environment. StarT-Voyager's protection scheme improves upon past designs by not requiring strictly synchronized gang-scheduling, and by supporting non-monolithic protection domains. To minimize the development effort and cost, the machine is designed to use unmodified commercial PowerPC 604-based SMP systems as the building block. A Network End-point Subsystem (NES) card which plugs into one of each SMP's processor card slots provides the interface to Arctic, a low-latency, high-bandwidth network developed at MIT. This paper describes StarT-Voyager's message passing mechanisms and their predicted performance.

Journal Article•DOI•

A modular computing architecture for autonomous robots

Huosheng Hu¹, Dongbing Gu, Michael Brady²•Institutions (2)

University of Reading¹, University of Oxford²

01 Mar 1998-Microprocessors and Microsystems

TL;DR: A modular computing architecture used for intelligent control of autonomous robots, which takes the form of multiple sensing and control layers, based on Locally Intelligent Control Agents in which IBM PowerPC, SIEMENS 80C166, and INMOS Transputers are adopted.

Proceedings Article•DOI•

A commercial multithreaded RISC processor

Salvatore N. Storino¹, Anthony Gus Aipperspach, J. Borkenhagen, R. Eickemeyer, S. Kunkel, S. Levenstein, Gregory J. Uhlmann•Institutions (1)

IBM¹

05 Feb 1998

TL;DR: A coarse-grained hardware-multithreaded processor for use in the IBM AS/400 uses a PowerPC architecture that supports two threads, which requires the replication of the processor architecture registers for each thread.

Abstract: Implementation of a coarse-grained hardware-multithreaded processor for use in the IBM AS/400 uses a PowerPC architecture that supports two threads. Hardware multithreading is a technique for tolerating memory latency by utilizing otherwise idle cycles in the CPU. This requires the replication of the processor architecture registers for each thread. Replication is not required for the majority of processor logic such as instruction cache, data cache, TLB, instruction fetch and dispatch mechanisms, branch units, fixed-point units, floating-point units, and storage-control units.

Proceedings Article•DOI•

Design reliability—estimation through statistical analysis of bug discovery data

Yossi Malka¹, Avi Ziv¹•Institutions (1)

IBM¹

01 May 1998

TL;DR: A study on two implementations of state-of-the-art PowerPC processors that shows that statistical analysis of bug discovery data can provide quality information on the progress of verification and good predictions of the number of bugs left in the design and the future MTTF.

Abstract: Statistical analysis of bug discovery data is used in the software industry to check the quality of the testing process and estimate the reliability of the tested program. In this paper, we show that the same techniques are applicable to hardware design verification. We performed a study on two implementations of state-of-the-art PowerPC processors that shows that these techniques can provide quality information on the progress of verification and good predictions of the number of bugs left in the design and the future MTTF.

Proceedings Article•DOI•

Designing for scan test of high performance embedded memories

E.K. Vida-Torku¹, G. Joos•Institutions (1)

IBM¹

18 Oct 1998

TL;DR: The addressing and clocking schemes in PowerPC™ microprocessor embedded memories present modeling challenges and aggressive Design for Test implementations are needed to help the test generation tools.

Abstract: The addressing and clocking schemes in PowerPC™ microprocessor embedded memories present modeling challenges. The ability of most scan based test tools to accurately generate test patterns for these embedded memories is limited. What is needed is aggressive Design for Test implementations that can help the test generation tools. In this paper we present our experiences in the design, modeling, and test of high performance embedded memories on the PowerPC microprocessors.

Proceedings Article•DOI•

Random self-test method applications on PowerPC™ microprocessor caches

R. Raina¹, R. Molyneaux•Institutions (1)

Motorola¹

19 Feb 1998

TL;DR: A novel method is described that can be used to generate test stimuli that are random as well as self-testing for digital systems by taking advantage of certain properties of the Design Under Validation.

Abstract: This paper describes a novel method for generating test stimuli for digital systems. By taking advantage of certain properties of the Design Under Validation, the method can be used to generate test stimuli that are random as well as self-testing. We discuss the requirements and limitations of this method on practical designs. The use of this method for High-Level Design Validation of caches in PowerPC™ microprocessors is also described. The paper concludes by identifying areas where further work is needed.

Proceedings Article•DOI•

Media processing with field-programmable gate arrays on a microprocessor's local bus

V. Michael Bove¹, Mark Lee¹, Yuan-Min Liu¹, Christopher McEniry¹, Thomas A. Nwodoh¹, John A. Watlington¹•Institutions (1)

Massachusetts Institute of Technology¹

21 Dec 1998

TL;DR: The Chidi system is a PCI-bus media processor card which performs its processing tasks on a large field-programmable gate array in conjunction with a general purpose CPU (PowerPC 604e).

Abstract: The Chidi system is a PCI-bus media processor card which performs its processing tasks on a large field-programmable gate array (Altera 10K100) in conjunction with a general purpose CPU (PowerPC 604e). Special address-generation and buffering logic (also implemented on FPGAs) allows the reconfigurable processor to share a local bus with the CPU, turning burst accesses to memory into continuous streams and converting between the memory's 64-bit words and the media data types. In this paper we present the design requirements for the Chidi system, describe the hardware architecture, and discuss the software model for its use in media processing.

Journal Article•DOI•

Real time data acquisition system for control and long pulse operation in Tore Supra

D. Moulin, B. Couturier, L. Ducobu, D. Elbeze, B. Gagey, B. Guillerminet, M. Leluyer, B. Rothan

01 Aug 1998-IEEE Transactions on Nuclear Science

TL;DR: The Tore Supra Data Acquisition System was completely redesigned in 1996, for plasma control and long pulse operation, and is now based on the Rtworks package which provides all the software basic modules to build a distributed on-line system.

Abstract: The Tore Supra Data Acquisition System was completely redesigned in 1996, for plasma control and long pulse operation. It is now based on the Rtworks package which provides all the software basic modules to build a distributed on-line system, from data acquisition level up to run control and data display. At the same time, the real-time plasma control system has been improved with several control units upgraded with VME hardware and PowerPC processors interconnected through a fast shared memory network.

Modeling the influence of multilevel interconnect on chip performance

Bibiche Micha Geuskens

17 Mar 1998

TL;DR: The Rensselaer Interconnect Performance Estimator (RIPE) is a design and evaluation tool for analyzing the impact on size, wireability, performance, power dissipation and reliability of single chip microprocessors as a function of interconnect, device, circuit, design and architectural parameters.

Abstract: The purpose of this work is the development of a design and evaluation tool, named "Rensselaer Interconnect Performance Estimator" (RIPE), to analyze the impact on size, wireability, performance, power dissipation and reliability of single chip microprocessors as a function of interconnect, device, circuit, design and architectural parameters. A study of existing microprocessors and their design practices has been done to identify the parameters required to model such a system to the first order. As a result, a system model encompassing memory, core logic and I/O circuitry has been presented. Compared to earlier performance estimators, such as SUSPENS and Sai-Halasz' cycle time estimator, RIPE can accurately predict the overall performance of current microprocessor systems. For the three major microprocessor architectures: DEC, PowerPC and Intel, RIPE results indicated agreement within 10% on key parameters such as transistor count, area, wiring levels, clock frequency and power dissipation. The RIPE model has also been used to study the SIA (Semiconductor Industry Association) Roadmap predictions and technology characteristics for future microprocessor systems. The results indicate that for the 0.10 μm generation, the performance of interconnect limits overall performance and a combination of performance improving design techniques, such as interconnect length limiting floorplans, new interconnect materials and architectures, are needed to be able to meet future performance goals.

Journal Article•DOI•

PPCMatrix: a PowerPC dotmatrix program to compare large genomic sequences against protein sequences

Thomas R. Bürglin¹•Institutions (1)

University of Basel¹

01 Jan 1998-Bioinformatics

TL;DR: An interactive dotmatrix program for the MacOS was designed that allows comparison of DNA to protein sequences using nested 3-frame translations.

Abstract: Summary: An interactive dotmatrix program for the MacOS was designed that allows comparison of DNA to protein sequences using nested 3-frame translations. Availability: Shareware, available at http://copan.bioz.unibas.ch/software/ Contact: burglin@ubaclu.unibas.ch
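As a rough illustration of the idea in this abstract, the sketch below translates a DNA sequence in its three forward reading frames and records a dot wherever a translated residue equals a residue of the protein sequence. It is not PPCMatrix's implementation: the function names `translate` and `dot_matrix` are hypothetical, the scoring is plain identity with no window or threshold, and the abstract's "nested" display of the frames is not modeled.

```python
from itertools import product

# Standard genetic code, codons enumerated with bases in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna, frame):
    """Translate dna starting at offset frame (0, 1 or 2); unknown codons become 'X'."""
    dna = dna.upper()
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X")
        for i in range(frame, len(dna) - 2, 3)
    )


def dot_matrix(dna, protein):
    """Return the set of (dna_position, protein_position) identity matches.

    dna_position is the index of the first base of the matching codon, so
    dots from all three frames share one DNA coordinate axis.
    """
    protein = protein.upper()
    dots = set()
    for frame in range(3):
        for i, translated_residue in enumerate(translate(dna, frame)):
            for j, protein_residue in enumerate(protein):
                if translated_residue == protein_residue:
                    dots.add((frame + 3 * i, j))
    return dots
```

For example, `dot_matrix("ATGGCC", "MA")` pairs the codon at DNA position 0 with protein position 0 and the codon at DNA position 3 with protein position 1. A practical tool would normally add a sliding window and a similarity matrix to suppress chance matches.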

Proceedings Article•DOI•

Parallel condition-code generation for high-frequency PowerPC microprocessors

J.L. Burns¹, K.J. Nowka•Institutions (1)

IBM¹

11 Jun 1998

TL;DR: A unique, high-frequency dataflow macro is described that accelerates conditional-branch resolution by computing condition codes in parallel with the corresponding arithmetic results, reducing conditional-branch latency while achieving high speed through a pulse-node, delayed-reset dynamic circuit implementation.

Abstract: Improving the speed and performance of microprocessors requires aggressive leveraging of the interplay of microarchitecture and circuit design. We describe a unique, high-frequency dataflow macro for accelerating conditional-branch resolution by computing condition codes in parallel with computing the corresponding arithmetic results. This macro improves the microarchitecture by reducing conditional-branch latency while achieving high speed through a pulse-node, delayed-reset dynamic circuit implementation. The design has been realized in a 64-bit PowerPC integer processor that operates at 1.0 GHz (0.15 micron CMOS process).
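The general idea of deriving a condition code from the operands, rather than from the finished arithmetic result, can be illustrated in software. The sketch below is not the circuit described in the paper: it uses a well-known carry-free identity for detecting a zero sum, and the function name `add_with_early_zero_flag` and the `width` parameter are hypothetical.

```python
def add_with_early_zero_flag(a, b, width=64):
    """Return (sum, is_zero) for a + b modulo 2**width.

    is_zero is computed from the operands alone, without using the sum.
    If every sum bit is zero, the carry into bit i must equal a_i XOR b_i,
    and the carry out of bit i is then a_i OR b_i. So the sum is zero
    exactly when (a XOR b) equals (a OR b) shifted left by one bit.
    """
    mask = (1 << width) - 1
    a &= mask
    b &= mask
    total = (a + b) & mask                        # ordinary carry-propagate add
    is_zero = (a ^ b) == (((a | b) << 1) & mask)  # no carry chain needed
    return total, is_zero
```

In hardware, both expressions can be evaluated side by side, so the zero flag need not wait for the adder's carry chain. The other condition bits (less than, greater than) need different logic that is not shown here.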

Patent•

System and method for interlocking barrier operations in load and store queues

Alexander Edward Okpisz¹, Thomas Albert Petersen¹, Amy May Tuvell¹, Ronny Lee Arnold¹•Institutions (1)

IBM¹

27 Apr 1998

TL;DR: Storage access blocking instructions, such as the EIEIO instruction implemented within the PowerPC architecture, block other storage access instructions at the bus interface stage rather than the execute stage, so cacheable instructions and other similar instructions not ordered by the EIEIO instruction can complete without being blocked by it.

Abstract: Storage access blocking instructions, such as the EIEIO instruction implemented within the PowerPC architecture, block other storage access instructions at the bus interface stage as opposed to the execute stage. Therefore, cacheable instructions and other similar instructions that are not ordered by the EIEIO instruction are allowed to complete without being blocked by it.

Journal Article•DOI•

Test strategy for the PowerPC 750 microprocessor

C. Pyron¹, J. Prado, J. Golab•Institutions (1)

Motorola¹

01 Jul 1998-IEEE Design & Test of Computers

TL;DR: Time-to-market goals are intricately entwined with the product testing strategy for a high-performance microprocessor, resulting in an on-time product introduction coupled with improved, more effective and thorough testing.

Abstract: Time-to-market goals are intricately entwined with the product testing strategy for a high-performance microprocessor. The result is an on-time product introduction coupled with improved, more effective and thorough testing.
