Showing papers on "PowerPC published in 1994"


Journal ArticleDOI
01 Dec 1994
TL;DR: Low-power design techniques are used throughout the entire design, including dynamically powered down execution units, resulting in workstation level performance packed into a low-power, low-cost design ideal for notebooks and desktop computers.
Abstract: A 28 mW/MHz at 80 MHz structured-custom RISC microprocessor design is described. This 32-b implementation of the PowerPC architecture is fabricated in a 3.3 V, 0.5 μm, 4-level metal CMOS technology, resulting in 1.6 million transistors in a 7.4 mm by 11.5 mm chip size. Dual 8-kilobyte instruction and data caches coupled to a high performance 32/64-b system bus and separate execution units (float, integer, load/store, and system units) result in peak instruction rates of three instructions per clock cycle. Low-power design techniques are used throughout the entire design, including dynamically powered down execution units. Typical power dissipation is kept under 2.2 W at 80 MHz. Three distinct levels of software-programmable, static, low-power operation for system power management are offered, resulting in standby power dissipation from 2 mW to 350 mW. CPU to bus clock ratios of 1×, 2×, 3×, and 4× are implemented to allow control of system power while maintaining processor performance. As a result, workstation level performance is packed into a low-power, low-cost design ideal for notebooks and desktop computers.

229 citations
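
The figures quoted in the abstract above can be related with a few lines of arithmetic. The sketch below is only illustrative: it assumes that power scales linearly with clock frequency (which is what a mW/MHz figure suggests, but is an approximation), and the function and constant names are made up for this example rather than taken from the paper.

```python
# Figures quoted in the abstract; everything else here is an illustrative assumption.
CPU_CLOCK_MHZ = 80          # processor clock from the abstract
POWER_PER_MHZ_MW = 28       # 28 mW/MHz from the abstract
CLOCK_RATIOS = (1, 2, 3, 4) # CPU-to-bus clock ratios from the abstract


def bus_clock_mhz(cpu_mhz, ratio):
    """Bus clock implied by a given CPU clock and CPU-to-bus ratio."""
    return cpu_mhz / ratio


def estimated_power_w(cpu_mhz, mw_per_mhz):
    """Power estimate in watts, assuming linear scaling with frequency."""
    return cpu_mhz * mw_per_mhz / 1000


if __name__ == "__main__":
    for ratio in CLOCK_RATIOS:
        print(f"{ratio}x ratio -> bus clock {bus_clock_mhz(CPU_CLOCK_MHZ, ratio):.2f} MHz")
    print(f"estimated power: {estimated_power_w(CPU_CLOCK_MHZ, POWER_PER_MHZ_MW):.2f} W")
```

Under that linear assumption the estimate lands at roughly 2.2 W, in line with the typical dissipation the abstract reports, and the higher ratios show how a slower system bus can be paired with an unchanged processor clock.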


Book
01 Jun 1994
TL;DR: The PowerPC Architecture is a must for anyone who needs to understand the levels of compatibility between different processors in the PowerPC family: the 601 microprocessor, the 603 (low-end, battery-powered requirements), the 604 (optimized price/performance for scalable symmetric multiprocessors), and the 620 (high-end technical and commercial performance requirements).
Abstract: This is the official technical description of the PowerPC architecture and its hardware conventions, developed jointly by IBM, Motorola, and Apple. The book is an essential reference for hardware and system-software designers and applications programmers developing a range of products using implementations of the PowerPC family of microprocessors, from palmtops to teraFLOPS. The PowerPC architecture provides a stable base for software, allowing applications that run on one PowerPC processor to run consistently on any other PowerPC processor. In addition, well-designed operating systems can be moved from one processor implementation to another by making only a few minor changes. To achieve this, the specification of the architecture has been structured into three Books, each corresponding to a distinct level of the architecture. Book I, User Instruction Set Architecture, describes the registers, instructions, storage model, and execution model that are available to all application programs. Book II, Virtual Environment Architecture, describes features of the architecture that permit application programs to create or modify code, to share data among programs in a multiprocessing system, and to optimize the performance of storage accesses. Book III, Operating Environment Architecture, describes features of the architecture that permit operating systems to allocate and manage storage, to handle errors encountered by application programs, to support I/O devices, and to provide the other services expected of secure, modern multiprocessor operating systems. An important feature of these specifications is that they only constrain implementations on matters that affect software compatibility. Even more significant, they specify the architecture in a manner that is independent of implementation.
The PowerPC Architecture is a must for anyone who needs to understand the levels of compatibility between different processors in the PowerPC family: the 601 microprocessor, the 603 (low-end, battery-powered requirements), the 604 (optimized price/performance for scalable symmetric multiprocessors), and the 620 (high-end technical and commercial performance requirements).

226 citations


Patent
11 Jan 1994
TL;DR: A dual-instruction-set CPU is able to execute x86 CISC (complex instruction set computer) code or PowerPC RISC (reduced instruction set computer) code.
Abstract: A dual-instruction-set CPU is able to execute x86 CISC (complex instruction set computer) code or PowerPC RISC (reduced instruction set computer) code. Three modes of operation are provided: CISC mode, RISC mode, both called user modes, and emulation mode. Emulation mode is entered upon reset, and performs various system checks and memory allocation. A special emulation driver is loaded into a portion of main memory set aside at reset. Software routines to emulate the more complex instructions of the CISC architecture using RISC instructions are also loaded into the emulation memory. A TLB is enabled, and translation tables and drivers are set up in the emulation memory. All TLB misses, even in the user modes, will cause entry to a translator driver in emulation mode. Since the TLB is always enabled for the user modes, and all misses are handled by the emulation code, the emulation code can set aside a portion of memory for itself and ensure that the user programs never have access to the emulation memory. Thus the programs, including operating systems, in CISC or RISC mode are unaware of emulation memory or even the existence of emulation mode.

128 citations


Journal ArticleDOI
TL;DR: While keeping the system interface compatible with the 601 microprocessor, the 604 microprocessor improves upon it by incorporating a phase-locked loop and an IEEE-Std 1149.1 boundary-scan (JTAG) interface on chip.
Abstract: The 604 microprocessor is the third member of the PowerPC family being developed jointly by Apple, IBM, and Motorola. Developed for use in desktop personal computers, workstations, and servers, this 32-bit implementation works with the software and bus in the PowerPC 601 and 603 microprocessors. While keeping the system interface compatible with the 601 microprocessor, we improved upon it by incorporating a phase-locked loop and an IEEE-Std 1149.1 boundary-scan (JTAG) interface on chip. In addition, an advanced machine organization delivers one and a half to two times the 601’s integer performance.

125 citations


Proceedings ArticleDOI
24 Oct 1994
TL;DR: The ability to recognise traffic signs in a road traffic scenario is an important feature of the Daimler-Benz autonomous vehicle VITA II; the paper focuses on the overall system design, the real-time implementation, and field test evaluation.
Abstract: The ability to recognise traffic signs in a road traffic scenario is an important feature of the Daimler-Benz autonomous vehicle VITA II. This real-time vision-based traffic sign recognition system has been developed by Daimler-Benz in the European research project PROMETHEUS. In this paper we focus on the overall system design, the real-time implementation, and field test evaluation. The software architecture of the system integrates three hierarchical levels of data processing. On each level the specific tasks are isolated. The lowest level comprises specialists for colour, shape and pictogram analysis; they perform the iconic to symbolic data transformation. On the highest level the administration processes organise data flow as a double bottom-up and top-down mechanism to dynamically interpret the image sequence. A hybrid parallel machine was designed for running the traffic sign recognition system in real time on a transputer network coupled to PowerPC processors.

89 citations


Patent
Paul Borrill
31 Oct 1994
TL;DR: The stateless multiplatform instruction set architecture (ISA) described here uses a very long instruction word (VLIW) architecture with 64-bit instructions, of which several high-order bits are reserved for an ISA identifier tag.
Abstract: A method and apparatus for providing a stateless multiplatform instruction set architecture (ISA) for use in a computer system having a processor and memory storing a control program for implementing the invention. The system is used to statelessly execute instructions authored to correspond to a variety of different ISA's on a unitary platform. The ISA of the invention uses a very long instruction word (VLIW) architecture with 64-bit instructions, of which several high-order bits are reserved for an ISA identifier tag. When the processor receives an instruction for execution, it inspects the instruction to determine from the ISA identifier tag to which original, native ISA the instruction corresponds. If the corresponding ISA is the native VLIW ISA for the processor, then the instruction is routed to the instruction dispatch unit of the processor, and thence to at least one functional unit for execution. If the corresponding ISA is not the native VLIW ISA, then the instruction is routed to one of a plurality of dynamic decode units (DDU's), each DDU being controlled by a translation routine that translates the instructions from a non-native ISA to the native VLIW ISA. The translated instructions are then sent to the instruction dispatch unit, and on to the appropriate functional unit(s). Any instruction that includes unused bits, such as 64 bit instructions with free higher-order bits, can accommodate the ISA identifier tag by simply using the unused bits. Instructions that do not include unused bits, such as 32-bit instructions for non-VLIW architectures (e.g. the ISA's for SPARC, PowerPC or x86), are appended with additional bits to bring the total to 64 bits, several of which are reserved for the ISA tag. The number of bits reserved for the ISA tag determines the number of non-native ISA's that are recognized by the system; e.g., three bits allows for the native ISA plus seven non-native ISA's to be recognized by the system. 
Incoming instructions corresponding to a non-native ISA for which no dynamic decode unit is available can be executed by conventional software emulation. Entire programs written for non-native ISA's (using, e.g., 32-bit instructions) can be converted to the format for the native VLIW ISA by appending, at the instruction loading stage or in a separate process independent of execution, the additional bits necessary both to fill out the instruction word lengths and to include the ISA identifier tag bits.

67 citations
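
The tagging scheme described in the patent abstract above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the patented mechanism: the tag width of three bits is the abstract's own example, but the tag values, the choice of 0 for the native ISA, and all function names are hypothetical.

```python
WORD_BITS = 64
TAG_BITS = 3                       # the abstract's example: native ISA plus seven others
TAG_SHIFT = WORD_BITS - TAG_BITS   # tag sits in the high-order bits
NATIVE_TAG = 0                     # assumption: tag 0 marks the native VLIW ISA

# Hypothetical tag assignments; the abstract does not specify any values.
ISA_TAGS = {"native": NATIVE_TAG, "sparc": 1, "powerpc": 2, "x86": 3}


def tag_instruction(word32, isa):
    """Pad a 32-bit instruction to 64 bits and write the ISA tag into the top bits."""
    if not 0 <= word32 < (1 << 32):
        raise ValueError("expected a 32-bit instruction word")
    return (ISA_TAGS[isa] << TAG_SHIFT) | word32


def read_tag(word64):
    """Extract the ISA identifier tag from a 64-bit instruction word."""
    return word64 >> TAG_SHIFT


def route(word64, decode_units):
    """Pick a path for an instruction, following the three cases in the abstract.

    decode_units is the set of tags for which a dynamic decode unit exists.
    """
    tag = read_tag(word64)
    if tag == NATIVE_TAG:
        return "dispatch"            # native VLIW: straight to the dispatch unit
    if tag in decode_units:
        return "dynamic-decode"      # translated by a DDU, then dispatched
    return "software-emulation"      # no DDU available for this ISA
```

With three tag bits there are `2 ** TAG_BITS` distinct values, which is how the abstract arrives at one native ISA plus seven non-native ones.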


Journal ArticleDOI
TL;DR: The PowerPC 603 incorporates a variety of features to reduce power dissipation: dynamic idle-time shutdown of separate execution units, low-power cache design, and power considerations for standard cells, data-path elements, and clocking.
Abstract: The PowerPC 603 incorporates a variety of features to reduce power dissipation: dynamic idle-time shutdown of separate execution units, low-power cache design, and power considerations for standard cells, data-path elements, and clocking. System-level features include three software-programmable static power management modes and a hardware-programmable phase-lock loop. Operating at 80 MHz, the 603 typically dissipates 2.2 W, while achieving an estimated 75 SPECint92 and 85 SPECfp92.

60 citations


Journal ArticleDOI
TL;DR: The PowerPC, a new RISC architecture derived from IBM’s POWER architecture, is available as an open-system standard; its designers specified it in four books to provide both software compatibility among PowerPC processors and maximum implementation flexibility.
Abstract: The PowerPC, a new RISC architecture derived from IBM’s POWER architecture, is currently available as an open-system standard. To provide both software compatibility among PowerPC processors and maximum implementation flexibility, its designers specified the PowerPC architecture in four books. As we show here, Book 1 describes the user-mode programming model and instruction set common to all PowerPC processors.

50 citations


Journal ArticleDOI
James E. Smith, Shlomo Weiss
TL;DR: A discussion is given on two RISC implementations: from Digital Equipment Corporation, the Alpha 21064, and from IBM/Motorola/Apple, the PowerPC 601; both are superscalar implementations, that is, they can sustain execution of two or more instructions per clock cycle.
Abstract: A discussion is given on two RISC implementations: from Digital Equipment Corporation, the Alpha 21064, and from IBM/Motorola/Apple, the PowerPC 601. Both are superscalar implementations, that is, they can sustain execution of two or more instructions per clock cycle. Otherwise, these two implementations present vastly different philosophies for achieving high performance. The PowerPC 601 focuses on powerful instructions and great flexibility in processing order, while the Alpha 21064 depends on a very fast clock, with simpler instructions and a more streamlined implementation structure. These two RISC microprocessors exemplify contrasting, but equally valid, implementation philosophies. An overview is given of the instruction sets and the authors emphasize the differences in design: PowerPC uses powerful instructions so that fewer are needed to get the job done; Alpha uses simple instructions so that the hardware can be kept simpler and faster. The authors also discuss the pipelined implementations of the two architectures; again, the contrast is between powerful and simple.

47 citations


Journal ArticleDOI
Ali A Poursepanj
TL;DR: The PowerPC performance modeling was based on trace-driven simulation, where the microprocessor organization is specified as a model, benchmark traces are generated and applied to the model, and performance data is measured and analyzed.
Abstract: The PowerPC performance modeling was based on trace-driven simulation, where the microprocessor organization is specified as a model, benchmark traces are generated and applied to the model, and performance data is measured and analyzed. In spite of the advantages of trace-driven simulation, which will be described later, meaningful benchmark traces are large and require prohibitive amounts of storage and time to analyze. There are a number of advantages to a trace-driven approach. There is no need to generate program results in a trace-driven model. Designers can concentrate on performance implications of architectural features without having to worry about generating correct results or modeling system I/O and other overhead. Traces can be statistically sampled to reduce simulation time with a very acceptable reduction in accuracy. This allows relatively quick analysis of large benchmarks, such as the SPEC 92 suite. The usefulness of the PowerPC performance models does not end at the processor design phase. Compiler developers need models to help tune compilers to the PowerPC instruction set. System designers need to make trade-offs on system cache and memory designs. System software groups need to use processor models to measure and tune the performance of libraries and low-level operating system services. Prospective customers who want to run their specific applications on the performance model to determine if the part meets their needs are viable users of the model. The PowerPC performance model is also used by the design team to assist in detecting performance bugs (e.g., unexpected bubbles) in the final logic model. The following sections will discuss performance modeling and the trace generation and simulation methodology used for the PowerPC microprocessors.
Trace-driven Performance Modeling
Performance Models
Processor performance models can be roughly divided into two types: trace-driven models and execution-based models. Trace-driven models simulate the data flow of the processor, but do not actually execute instructions or generate results. Instead, they are fed a program trace, called a dynamic trace, which contains the dynamic sequence of instruction addresses, instruction op-codes, and data addresses that

45 citations
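
The trace-driven approach described in the abstract above, including trace sampling, can be illustrated with a toy model. This is a sketch under loud assumptions: the record layout follows the abstract (instruction address, opcode, data address), but the latency table, the in-order single-issue timing model, and the periodic sampling scheme are invented for illustration and do not reflect the actual PowerPC performance models.

```python
from collections import namedtuple

# One dynamic-trace record: instruction address, opcode, data address (or None).
TraceRecord = namedtuple("TraceRecord", "pc opcode data_addr")

# Hypothetical per-opcode latencies in cycles; not taken from any real PowerPC model.
LATENCY = {"add": 1, "load": 2, "mul": 4, "branch": 1}


def simulate(trace):
    """Walk the trace and count cycles without executing any instruction.

    Toy timing model: in-order, one instruction at a time, fixed latencies.
    Returns (instruction_count, cycle_count).
    """
    instructions = 0
    cycles = 0
    for record in trace:
        cycles += LATENCY[record.opcode]
        instructions += 1
    return instructions, cycles


def ipc(trace):
    """Instructions per cycle for a trace under the toy model."""
    instructions, cycles = simulate(trace)
    return instructions / cycles


def sample(trace, window, period):
    """Keep the first `window` records out of every `period` records.

    A simple periodic scheme standing in for the statistical sampling the
    abstract mentions; the real sampling method is not described there.
    """
    return [rec for i, rec in enumerate(trace) if i % period < window]
```

Because the model only consumes addresses and opcodes, it never needs to produce program results, which is the property the abstract highlights; sampling shortens the trace at the cost of some accuracy.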


Proceedings ArticleDOI
09 Jun 1994
TL;DR: A 3.3 V phase-locked loop (PLL) clock synthesizer implemented in 0.5 μm CMOS technology is described; it supports internal to external clock frequency ratios of 1, 1.5, 2, 3, and 4 as well as numerous static power down modes for PowerPC microprocessors.
Abstract: A 3.3 V phase-locked loop (PLL) clock synthesizer implemented in 0.5 μm CMOS technology is described. The PLL supports internal to external clock frequency ratios of 1, 1.5, 2, 3, and 4 as well as numerous static power down modes for PowerPC microprocessors. The CPU clock lock range spans from 6 to 175 MHz. Lock times below 15 μs, PLL power dissipation below 10 mW as well as phase error and jitter below ±100 ps have been measured. The total area of the PLL is 0.52 mm².

Proceedings ArticleDOI
C. Hunter, J. Slaton, J. Eno, R. Jessani, C. Dietz
02 Oct 1994
TL;DR: The RAM BIST design implemented on the PowerPC 603 microprocessor encompasses a novel state machine design built using logic synthesis tools and is constrained by the need to minimize area overhead while providing high test coverage and rapid at-speed testing.
Abstract: The PowerPC 603 microprocessor is designed for low power, low cost computing applications. A RAM built-in-self-test (BIST) implementation tests the split 8k instruction and data caches and the tag arrays. The design is constrained by the need to minimize area overhead while providing high test coverage and rapid at-speed testing. The solution encompasses a novel state machine design built using logic synthesis tools. This paper presents the RAM BIST design implemented on the PowerPC 603 microprocessor.

Proceedings ArticleDOI
C. Hunter, E.K. Vida-Torku, J. LeBlanc
02 Oct 1994
TL;DR: The testability and manufacturability features implemented in the PowerPC 603 microprocessor are presented, as well as the issues involved in reconciling a common test plan for two fabrication facilities with differing expectations.
Abstract: The PowerPC 603 microprocessor is a high performance, low power, and low cost RISC microprocessor which was designed at the Somerset Design Center by a team of Motorola, IBM and Apple engineers. The testability and manufacturability features implemented in the PowerPC 603 microprocessor are presented, as well as the issues involved in reconciling a common test plan for two fabrication facilities with differing expectations.

Book
01 Jan 1994
TL;DR: This book covers modern computer design concepts, the POWER and PowerPC architectures, the POWER1, POWER2, and PowerPC 601 implementations (pipelines, branches and interrupts, cache memories), PowerPC support for multiprocessing, system organization, and a comparison of the PowerPC 601 with the Alpha 21064.
Abstract: Foreword; Preface
1 Modern Computer Design Concepts: 1.1 Introduction; 1.2 RISC Architectures; 1.3 An Introduction to Pipelining; 1.4 Beyond Simple Pipelines; 1.5 Instruction Scheduling; 1.6 Modern Computer Systems; 1.7 POWER and PowerPC: The Big Picture; 1.8 The Rest of the Book; 1.9 References
2 POWER Architecture: 2.1 Introduction; 2.2 Instruction Set Basics; 2.3 Fixed-Point Unit; 2.4 Branch Unit; 2.5 Floating-Point Unit; 2.6 Virtual Address Generation and Translation; 2.7 Protection; 2.8 References
3 POWER1 Implementation: Pipelines: 3.1 Introduction; 3.2 Pipelined Structure of the CPU; 3.3 Branch Unit; 3.4 Fixed-Point Unit; 3.5 Floating-Point Unit; 3.6 References
4 POWER1 Implementation: 4.1 Introduction; 4.2 Solving the Branch Problems; 4.3 Branches in the POWER1; 4.4 Precise Interrupts; 4.5 Interrupts in POWER1; 4.6 References
5 POWER1 Implementation: Cache Memories: 5.1 Introduction; 5.2 Cache Memory Overview; 5.3 POWER1 Instruction Cache; 5.4 POWER1 Data Cache; 5.5 References
6 POWER2: The Next Generation: 6.1 Introduction; 6.2 POWER Architecture Extensions; 6.3 Pipeline Overview; 6.4 Branch Unit; 6.5 Fixed-Point Unit; 6.6 Floating-Point Unit; 6.7 Instruction Cache; 6.8 Data Cache; 6.9 Summary; 6.10 References
7 PowerPC Architecture: 7.1 Introduction; 7.2 Fixed-Point Unit; 7.3 Branch Unit; 7.4 Floating-Point Unit; 7.5 Virtual Address Generation and Translation; 7.6 PowerPC versus POWER: Simplification; 7.7 PowerPC versus POWER: Extensions; 7.8 Summary and Conclusions; 7.9 References
8 PowerPC 601 Implementation: 8.1 Introduction; 8.2 Pipelines; 8.3 Branch Processing; 8.4 Cache Memory; 8.5 PowerPC 601 and POWER1 Implementations; 8.6 Summary and Conclusions; 8.7 References
9 PowerPC: Support for Multiprocessing: 9.1 Introduction; 9.2 Architectural Support for Multiprocessing; 9.3 Memory Ordering; 9.4 Cache Coherence; 9.5 Higher-Level Caches; 9.6 Cache and Lookaside Buffer Management; 9.7 References
10 System Organization: 10.1 Introduction; 10.2 PowerPC Personal Computers; 10.3 PowerPC Multiprocessor Systems; 10.4 RS/6000 Workstation Overview; 10.5 RS/6000 Main memory; 10.6 RS/6000 Input/Output System; 10.7 RS/6000 Clustered Multicomputers; 10.8 Summary; 10.9 References
11 PowerPC 601 and Alpha 21064: 11.1 Introduction; 11.2 Implementation Overview; 11.3 Architecture Comparison; 11.4 Summary; 11.5 References
A IEEE 754 Floating-Point Standard: A.1 Floating-Point Numbers; A.2 Floating-Point Exceptions
B POWER Instruction Format
C POWER Instruction Set Sorted by Mnemonic
D PowerPC Instruction Formats
E PowerPC Instruction Set Sorted by Mnemonic
F Cross Reference for Changed POWER Mnemonics
Bibliography; Index

Journal ArticleDOI
TL;DR: The PowerPC is a new RISC architecture derived from IBM's POWER architecture, with changes that simplify implementations, increase clock rates, enable a higher degree of superscalar execution, extend the architecture to 64 bits, and add multiprocessor support.
Abstract: The PowerPC is a new RISC architecture derived from IBM's POWER architecture. The changes made to POWER simplify implementations, increase clock rates, enable a higher degree of superscalar execution, extend the architecture to 64 bits, and add multiprocessor support. For compatibility with existing software, the developers retained POWER's basic instruction set, opcode assignments, and programming model.

Proceedings ArticleDOI
S. Gary, C. Dietz, J. Eno, G. Gerosa, Sung Park, H. Sanchez
28 Feb 1994
TL;DR: Various design features optimize the PowerPC 603 for both power and performance, creating an ideal microprocessor solution for portable applications.
Abstract: The PowerPC 603 microprocessor is a low-power implementation of the PowerPC architecture. The superscalar organization includes dynamic localized shutdown of execution units to reduce normal-mode power consumption. Three levels of static low-power operation are software programmable for system power management. The 603 PLL (phase lock loop) is capable of generating an internal processor clock at 1×, 2×, 3× or 4× the system clock speed to allow control of system power while maintaining processor performance. Various design features optimize the 603 for both power and performance, creating an ideal microprocessor solution for portable applications.

Proceedings ArticleDOI
28 Feb 1994
TL;DR: The PowerPC 603 microprocessor is the second member of the PowerPC microprocessor family featuring low power operation of less than 3 watts while maintaining high performance of 75 SPECint92 (estimated) at 80 MHz.
Abstract: The PowerPC 603 microprocessor is the second member of the PowerPC microprocessor family. The 603 is a superscalar implementation featuring low power operation of less than 3 watts while maintaining high performance of 75 SPECint92 (estimated) at 80 MHz. The 7.4 mm by 11.5 mm design is implemented in 0.5 μm, four-level metal CMOS technology. The 603 features dual 8-kByte instruction and data caches and a 32/64-bit system bus. Peak instruction rates of 3 instructions per clock cycle give outstanding performance to notebook and portable applications.

Proceedings ArticleDOI
15 Jun 1994
TL;DR: The DM/6000 prototype is a fault-tolerant/durable-memory RS/6000, based on the IBM PowerPC 601 microprocessor and is equivalent in performance and software appearance to a conventional 4-way shared bus, cache coherent, symmetric multiprocessor (SMP), with 4 gigabytes of non-volatile main storage.
Abstract: The DM/6000 prototype is a fault-tolerant/durable-memory RS/6000. The main storage of this system is battery backed so as to maintain memory content across prolonged power interruptions. In addition, there are no single points of failure, and all likely multiple failure scenarios are covered. The prototype is intended to match the data integrity and availability characteristics of RAID5 disks. Redundancy is managed in hardware and is transparent to the software; application programs and the operating system (AIX) can run unmodified. The prototype is based on the IBM PowerPC 601 microprocessor operating at 80 MHz and is equivalent in performance and software appearance to a conventional 4-way shared bus, cache coherent, symmetric multiprocessor (SMP), with 4 gigabytes of non-volatile main storage.

Book
01 Nov 1994
TL;DR: Delineates the innovations and advances that led to the development of Intel's Pentium and IBM/Motorola/Apple's PowerPC, and explores the potential design and implementation of instruction-level parallelism in modern processors.
Abstract: Delineates the innovations and advances that led to the development of Intel's Pentium and IBM/Motorola/Apple's PowerPC, and explores the potential design and implementation of instruction-level parallelism in modern processors. Papers illustrate solutions to the true data dependency problem and the

Journal ArticleDOI
TL;DR: Using the best tools and methodology available, the design team took the 603 from concept to working silicon in 18 months, providing fully functional first-pass silicon that ran at the design target speed of 80 MHz.
Abstract: Following closely on the heels of its predecessor, the PowerPC 601 microprocessor [1], the 603 microprocessor was developed at the joint Motorola/IBM/Apple Somerset Design Center in Austin, Texas. The 603 microarchitecture evolved from Apple, IBM, and Motorola’s collective experience on several past designs. The similarity of the POWER and PowerPC architectures permitted the use of sample traces generated by RISC System/6000 machines for evaluation of design trade-offs. The compiler groups also provided their insight to ensure the traces from the past generation of processors and compilers, with their own specific peculiarities, did not misguide the 603’s microarchitecture definition, and that tradeoffs selected were appropriate for the next generation of compilers. To accelerate the design and test process, engineers employed a formal VLSI design methodology derived from the best of both IBM and Motorola’s CAD tools. These tools enable both the rapid design and dense packing capability necessary to produce very high-volume, high-yield microprocessors for the commercial market. The 603 design team employed a combination of custom circuitry (for arrays), library components (for data paths), and standard cell place and route (for random logic) to accomplish the 603 design. Using the best tools and methodology available, the design team took the 603 from concept to working silicon in 18 months. Ongoing design evaluation and debugging, including simulation of 28 billion processor cycles prior to tape-out, provided fully functional first-pass silicon that ran at the design target speed of 80 MHz. The PowerPC 603 microprocessor is manufactured by Motorola in Austin, Texas, and by IBM in Burlington, Vt. Motorola and IBM both fabricate the 603 using a 0.5 μm, 4-level metal, 3.3 V DC CMOS process with design rules compatible with both companies’ semiconductor processes.
The die is designed to be packaged in either a 240-pin ceramic quad flat pack or a ball-grid array package. Figure 1 is a photograph of the 603 die.

Journal ArticleDOI
Brad W. Suessmith, George Paap
TL;DR: The PowerPC 603 allows the system designer to control energy consumption through both hardware and software means as well as providing automatic internal power management.
Abstract: Addressing the need for long battery life in portable applications and environmental concerns about energy consumption requires microprocessors with low power consumption as well as high performance. A primary design goal of the PowerPC 603 microprocessor was to provide sophisticated power management without compromising next-generation performance. As a result, the 603 is ideal for portable applications such as laptop computers in addition to Energy Star compliant desktop computers. The PowerPC 603 allows the system designer to control energy consumption through both hardware and software means as well as providing automatic internal power management.

Proceedings ArticleDOI
A. Poursepanj, D. Ogden, B. Burgess, S. Gary, Carl D. Dietz, D. Lee, S. Surya, M. Peters
28 Feb 1994
TL;DR: Performance modeling was used in conjunction with application code traces to tune the PowerPC 603 microprocessor design and simulation model execution of fragments of the traces verified the performance model accuracy.
Abstract: Performance modeling was used in conjunction with application code traces to tune the PowerPC 603 microprocessor design. This modeling technique allowed the design space to be constrained by performance, power and size. Trade-offs were examined with high confidence of final performance. Sampled traces provided a fast turnaround for evaluation of the design space. Finally, simulation model execution of fragments of the traces verified the performance model accuracy.

Journal ArticleDOI
TL;DR: This article will show that this new bus definition has demonstrated sufficient performance capability and flexibility to become the standard interface for several follow-on projects, including the PowerPC 603 and 604 microprocessors.
Abstract: The 601 is the first implementation of the PowerPC architecture. Given this project’s aggressive schedule goals, the 601 designers chose to use Motorola’s existing 88110 bus, with some enhancements, rather than introducing an entirely new bus definition. As this article will show, this new bus definition has demonstrated sufficient performance capability and flexibility to become the standard interface for several follow-on projects, including the PowerPC 603 and 604 microprocessors. The bus definition elements common to all of these projects (PowerPC 601, 603, and 604) form the PowerPC 60X bus. The 60X bus must support a wide range of system configurations, including single-processor, low-cost laptop machines, high-performance desktop personal computers and workstations, and multiprocessor, file and compute server systems. Figure 1 shows a typical box system configuration. The bus provides the interconnection and transfer protocols between one or more processor nodes, memory, and typically at least one expansion bridge to a system bus such as PCI, Microchannel, or VME. The processor node consists of two or more PowerPC microprocessors and, optionally, a secondary cache (L2 cache). Figure 1 also shows the L2 cache as a look-through design, though lookaside cache designs can be used. Depending on the goals for a particular system, the graphics subsystem (not shown) may attach directly to the 60X bus, or to the system bus. The primary considerations for the 601 project were quick time to market and high performance. We targeted the 601 towards systems ranging from desktop PCs to multiprocessor file/compute servers. Therefore, the bus definition had to provide a robust multiprocessing solution with minimal silicon and schedule impact. The 603 project goals dictated reduced multiprocessing support and several minor changes in the bus definition, while maintaining backwards compatibility with the 601.
Though the 604 maintains this backwards compatibility with the 601, it again incorporates a small evolutionary step forward in the bus definition.

Journal ArticleDOI
William C. Anderson
TL;DR: The successful introduction of a new computer architecture into the marketplace requires that both software and hardware be available simultaneously at the time of system introduction.
Abstract: The successful introduction of a new computer architecture into the marketplace requires that both software and hardware be available simultaneously at the time of system introduction. Moreover, there is substantial need for software tool support (e.g., compilers and simulators) during the design phase of the microprocessor itself. Such phased development necessitates coordination among several groups:

Proceedings ArticleDOI
16 Feb 1994
TL;DR: This superscalar microprocessor is a 32-b implementation of the PowerPC Architecture that offers workstation-level performance packed into a low-power, low-cost design ideal for notebooks and desktop computers.
Abstract: This superscalar microprocessor is a 32-b implementation of the PowerPC Architecture. With an estimated performance/power ratio of 25 SPECint92/W at 80 MHz, this RISC-style chip offers workstation-level performance packed into a low-power, low-cost design ideal for notebooks and desktop computers.

Proceedings ArticleDOI
S. Surya, Pradip Bose, J.A. Abraham
10 Oct 1994
TL;DR: Two key strategies are presented for validating functional timing models coded to predict instructions-per-cycle (IPC) performance for an advanced superscalar processor family: transient mode testing and steady-state parametric testing.
Abstract: We consider the problem of validating a functional (architectural) timing model coded to predict instructions-per-cycle (IPC) performance for an advanced superscalar processor family. We present a methodology based on loop test cases for validating such models. For the purpose of this paper, we focus on two key strategies within our overall validation methodology: transient mode testing and steady-state parametric testing. We state a few key lemmas characterizing the underlying theory and present a set of experimental results to illustrate the use of these validation strategies.

Journal ArticleDOI
Julie Shipnes, Mike Phillip
TL;DR: The modular structure of the compiler and the data and information flow through its major phases are described, showing how the Motorola PowerPC compilers are designed to provide the high performance and diversity that is essential to the PowerPC Architecture.
Abstract: The need for balance between software and hardware is a well-known principle of RISC microprocessor design methodologies. In order to achieve a high level of performance, RISC microprocessors are designed to allow compilers to take full advantage of the pipelines and resources available. The PowerPC family of microprocessors is being designed to be used for many purposes, ranging from low-power embedded controllers to powerful, supercomputer-class multiprocessor systems. This diversity of uses will lead to an equally diverse set of operating system environments, including AIX, Macintosh OS, Solaris and Windows/NT, among others. Despite the multitude of PowerPC processor and system configurations being developed, there remains a need for highly optimizing compilers that utilize the base PowerPC Architecture as well as other implementations of the chips and systems designed around the architecture. Motorola has developed a highly optimizing, modular compilation environment that can be quickly adapted to various PowerPC microprocessor and system configurations. This suite of C, C++ and Fortran compilers is designed to meet the following criteria: highly optimizing, ensuring optimal performance for PowerPC microprocessors; highly retargetable, ensuring rapid time-to-market; highly configurable, supporting multiple object file and debugging formats; and compliant to software standards, ensuring portability of code between chips. This article will describe the modular structure of the compiler, the data and information flow through the major phases of the compiler, and offer some discussion on architecture and implementation-specific optimizations currently performed in the PowerPC compilers. Through examples and descriptions, an understanding can be achieved as to how the Motorola PowerPC compilers are designed to provide the high performance and diversity that is essential to the PowerPC Architecture.
The Motorola compilation system, based partially on technology acquired from Apogee Software for the 88000 architecture, consists of a series of components that collectively provide highly optimized PowerPC microprocessor code for a wide range of source languages, object file and debugging formats, and system environments. Conceptually, the heart of the Motorola compilation system is a common core that integrates multiple front ends with target-specific code generators to provide consistently high performance across an extremely wide range of target environments (see


Proceedings ArticleDOI
10 Oct 1994
TL;DR: The timing verification and optimization tools used for the 601, 603, 604 and 620 PowerPC processor designs are presented and a method for automatically deriving timing constraints for timing optimization is described.
Abstract: This paper presents the timing verification and optimization tools used for the 601, 603, 604 and 620 PowerPC processor designs. The timing verification is done by static timing analysis at the chip level, while the timing optimization is done by synthesis at the macro level. A method for automatically deriving timing constraints for timing optimization is described.

Proceedings ArticleDOI
John E. Bertsch, Kerry Bernstein, L. Heller, E. J. Nowak, Francis Roger White
07 Jun 1994
TL;DR: In this article, an experimental 2.0 V PowerPC 601 microprocessor demonstrating 3× active power reduction and performance comparable to the 3.6 V version has been fabricated.
Abstract: An experimental 2.0 V PowerPC 601 microprocessor demonstrating 3× active power reduction and performance comparable to the 3.6 V version has been fabricated. The standard 3.6 V 0.6 μm CMOS technology was modified for low-power operation with unmodified circuits/masks. No degradation to yield was observed. Experimental low-voltage PowerPC 601 process/device alterations and test results are described.