Showing papers on "PowerPC published in 1995"


Proceedings ArticleDOI
01 Jan 1995
TL;DR: A new methodology and test program generator have been used for the functional verification of six IBM PowerPC processors and despite the complexity of the PowerPC architecture, the three processors verified so far had fully functional first silicon.
Abstract: A new methodology and test program generator have been used for the functional verification of six IBM PowerPC processors. The generator contains a formal model of the PowerPC architecture and a heuristic database of testing expertise. It has been used on a daily basis for two years by about a hundred designers and testing engineers at four IBM sites. The new methodology significantly reduced the functional verification period and time to market of the PowerPC processors. Despite the complexity of the PowerPC architecture, the three processors verified so far had fully functional first silicon.

183 citations


Journal ArticleDOI
TL;DR: In this article, a 3.3 V Phase-Locked-Loop (PLL) clock synthesizer implemented in 0.5 μm CMOS technology is described.
Abstract: A 3.3 V Phase-Locked-Loop (PLL) clock synthesizer implemented in 0.5 μm CMOS technology is described. The PLL supports internal to external clock frequency ratios of 1, 1.5, 2, 3, and 4 as well as numerous static power down modes for PowerPC microprocessors. The CPU clock lock range spans from 6 to 175 MHz. Lock times below 15 μs, PLL power dissipation below 10 mW as well as phase error and jitter below ±100 ps have been measured. The total area of the PLL is 0.52 mm².

113 citations
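
The ratio and lock-range figures in the abstract above can be combined into a quick check of which CPU clocks are reachable from a given bus clock. This is a minimal sketch, assuming that "internal to external" means CPU clock divided by bus clock and that the 6 to 175 MHz lock range is the only constraint; the function name `cpu_clock_options` is hypothetical and not taken from the paper.

```python
# Figures quoted in the abstract; everything else here is illustrative.
RATIOS = (1, 1.5, 2, 3, 4)          # internal : external clock ratios
LOCK_RANGE_MHZ = (6.0, 175.0)       # CPU clock lock range


def cpu_clock_options(bus_mhz):
    """Map each supported ratio to the resulting CPU clock (MHz),
    keeping only ratios whose CPU clock lies inside the lock range."""
    low, high = LOCK_RANGE_MHZ
    return {
        ratio: bus_mhz * ratio
        for ratio in RATIOS
        if low <= bus_mhz * ratio <= high
    }


if __name__ == "__main__":
    for ratio, cpu_mhz in cpu_clock_options(50.0).items():
        print(f"bus 50.0 MHz x {ratio} -> CPU {cpu_mhz} MHz")
```

With a 50 MHz bus, the 4x ratio would need a 200 MHz CPU clock, which falls outside the quoted lock range, so the sketch drops it.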


Journal Article
TL;DR: In this paper, a 3.3 V Phase-Locked-Loop (PLL) clock synthesizer implemented in 0.5 μm CMOS technology is described.
Abstract: A 3.3 V Phase-Locked-Loop (PLL) clock synthesizer implemented in 0.5 μm CMOS technology is described. The PLL supports internal to external clock frequency ratios of 1, 1.5, 2, 3, and 4 as well as numerous static power down modes for PowerPC microprocessors. The CPU clock lock range spans from 6 to 175 MHz. Lock times below 15 μs, PLL power dissipation below 10 mW as well as phase error and jitter below ±100 ps have been measured. The total area of the PLL is 0.52 mm².

95 citations


Book ChapterDOI
29 Aug 1995
TL;DR: The ability to be both a collection of standard SMPs and an aggressive message passing machine with coherent shared memory makes StarT-ng a good building block for incrementally expandable parallel machines.
Abstract: StarT-ng is a joint MIT-Motorola project to build a high-performance message passing machine from commercial systems. Each site of the machine consists of a PowerPC 620-based Motorola symmetric multiprocessor (SMP) running the AIX 4.1 operating system. Every processor is connected to a low-latency, high-bandwidth network that is directly accessible from user-level code. In addition to fast message passing capabilities, the machine has experimental support for cache-coherent shared memory across sites. When the machine requires memory to be kept globally coherent, one processor on each site is devoted to supporting shared memory. When globally coherent shared memory is not required, that processor can be used for normal computation tasks. StarT-ng will be delivered at about the time the base SMP is introduced into the marketplace. The ability to be both a collection of standard SMPs and an aggressive message passing machine with coherent shared memory makes StarT-ng a good building block for incrementally expandable parallel machines.

65 citations


Proceedings ArticleDOI
Anthony Correale1
23 Apr 1995
TL;DR: The need for power-efficient embedded controllers became obvious when the explosion in portable communications and consumer segments began. While the IBM PowerPC™ 4xx embedded controller family addresses a wide variety of market segments, its low power consumption makes it particularly attractive for the consumer electronics and portable communications market segments.
Abstract: The need for power-efficient embedded controllers became obvious when the explosion in portable communications and consumer segments began. While the IBM PowerPC™ 4xx embedded controller family addresses a wide variety of market segments, its low power consumption makes it particularly attractive for the consumer electronics and portable communications market segments. While low power consumption is most often thought of as being important to extend battery life, there are other advantages of using a power-efficient embedded controller:

60 citations


Proceedings ArticleDOI
27 Jun 1995

60 citations


Proceedings ArticleDOI
D. Levitan1, T. Thomas, P. Tu
05 Mar 1995
TL;DR: The PowerPC 620 RISC microprocessor is the first chip for the application server and technical workstation product line within the PowerPC family and utilizes a high performance microarchitecture with many advanced superscalar features to exploit instruction level parallelism.
Abstract: The PowerPC 620 RISC microprocessor is the first chip for the application server and technical workstation product line within the PowerPC family. It utilizes a high performance microarchitecture with many advanced superscalar features to exploit instruction level parallelism. It is the first 64-bit implementation of the PowerPC architecture supporting both 32- and 64-bit application software, and is compatible with the PowerPC 601, PowerPC 603, and PowerPC 604 microprocessors.

58 citations


Proceedings ArticleDOI
01 May 1995
TL;DR: A performance simulator is developed using the VMW (Visualization-based Microarchitecture Workbench) retargetable framework, and detailed quantitative analyses of the effectiveness of all key microarchitecture features are presented.
Abstract: The PowerPC 620™ microprocessor is the most recent and performance leading member of the PowerPC™ family. The 64-bit PowerPC 620 microprocessor employs a two-phase branch prediction scheme, dynamic renaming for all the register files, distributed multi-entry reservation stations, true out-of-order execution by six execution units, and a completion buffer for ensuring precise exceptions. This paper presents an instruction-level performance evaluation of the 620 microarchitecture. A performance simulator is developed using the VMW (Visualization-based Microarchitecture Workbench) retargetable framework. The VMW-based simulator accurately models the microarchitecture down to the machine cycle level. Extensive trace-driven simulation is performed using the SPEC92 benchmarks. Detailed quantitative analyses of the effectiveness of all key microarchitecture features are presented.

57 citations


Proceedings ArticleDOI
02 Oct 1995
TL;DR: It is shown that a tight integration of the verification approach into the overall design methodology allows the formal verification of complex microprocessor implementations without compromising the design process or performance of the resulting system.
Abstract: This paper presents the use of formal methods in the design of a PowerPC microprocessor. The chosen methodology employs two independently developed design views, a register-transfer level specification for efficient system simulation and a transistor level implementation geared toward maximal processor performance. A BDD-based verification tool is used to functionally compare the two views which essentially validates the transistor-level implementation with respect to any functional simulation/verification performed at the register-transfer level. We show that a tight integration of the verification approach into the overall design methodology allows the formal verification of complex microprocessor implementations without compromising the design process or performance of the resulting system.

53 citations


Proceedings ArticleDOI
G.B. Kromann1, D. Gerke1, W. Huang1
21 May 1995
TL;DR: In this article, the authors introduce the C4/CBGA interconnect technology and address the following: (1) the PCB land definition and board preparation requirements, (2) the ball-grid-array to board assembly methods, (3) the electrical design considerations, (4) the heat transfer mechanism and thermal control options, and (5) the CBGA-to-PCB testing and reliability.
Abstract: The Motorola PowerPC 603 and PowerPC 604 microprocessors are available in the 21 mm controlled-collapsed-chip-connection/ceramic-ball-grid-array single-chip package (C4/CBGA). This paper will introduce the C4/CBGA interconnect technology and address the following: (1) the PCB land definition and board preparation requirements, (2) the ball-grid-array to board assembly methods, (3) the electrical design considerations, (4) the heat transfer mechanism and thermal control options, and (5) the CBGA-to-PCB testing and reliability.

30 citations


Proceedings ArticleDOI
D. Lewin1, L. Fournier1, M. Levinger1, E. Roytman1, G. Shurek1 
28 Mar 1995
TL;DR: A framework, and an algorithm that has been implemented in the Model-Based Test-Generator are described, which allows flexibility in modeling new addressing modes with which memory accesses are generated.
Abstract: A central problem in automatic test generation is solving constraints for memory access generation. A framework, and an algorithm that has been implemented in the Model-Based Test-Generator are described. This generic algorithm allows flexibility in modeling new addressing modes with which memory accesses are generated. The algorithm currently handles address constraint satisfaction for complex addressing modes in the PowerPC, x86, and other architectures.

Patent
Paul Borrill1
20 Oct 1995
TL;DR: The first stateless multiplatform instruction set architecture (ISA) as discussed by the authors uses a very long instruction word (VLIW) architecture with 64-bit instructions, of which several high-order bits are reserved for an ISA identifier tag.
Abstract: A method and apparatus for providing a stateless multiplatform instruction set architecture (ISA) for use in a computer system having a processor and memory storing a control program for implementing the invention. The system is used to statelessly execute instructions authored to correspond to a variety of different ISA's on a unitary platform. The ISA of the invention uses a very long instruction word (VLIW) architecture with 64-bit instructions, of which several high-order bits are reserved for an ISA identifier tag. When the processor receives an instruction for execution, it inspects the instruction to determine from the ISA identifier tag to which original, native ISA the instruction corresponds. If the corresponding ISA is the native VLIW ISA for the processor, then the instruction is routed to the instruction dispatch unit of the processor, and thence to at least one functional unit for execution. If the corresponding ISA is not the native VLIW ISA, then the instruction is routed to one of a plurality of dynamic decode units (DDU's), each DDU being controlled by a translation routine that translates the instructions from a non-native ISA to the native VLIW ISA. The translated instructions are then sent to the instruction dispatch unit, and on to the appropriate functional unit(s). Any instruction that includes unused bits, such as 64-bit instructions with free higher-order bits, can accommodate the ISA identifier tag by simply using the unused bits. Instructions that do not include unused bits, such as 32-bit instructions for non-VLIW architectures (e.g. the ISA's for SPARC, PowerPC or x86), are appended with additional bits to bring the total to 64 bits, several of which are reserved for the ISA tag. The number of bits reserved for the ISA tag determines the number of non-native ISA's that are recognized by the system; e.g., three bits allows for the native ISA plus seven non-native ISA's to be recognized by the system. 
Incoming instructions corresponding to a non-native ISA for which no dynamic decode unit is available can be executed by conventional software emulation. Entire programs written for non-native ISA's (using, e.g., 32-bit instructions) can be converted to the format for the native VLIW ISA by appending, at the instruction loading stage or in a separate process independent of execution, the additional bits necessary both to fill out the instruction word lengths and to include the ISA identifier tag bits.
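
The tagging scheme in the abstract above (pad a 32-bit instruction to 64 bits, reserve high-order bits for an ISA identifier, route on that tag) can be sketched as follows. This is an illustration only: the 3-bit tag width comes from the abstract's own example, but the tag position in the top three bits, the numeric ISA identifiers, and the names `tag_instruction` and `route` are assumptions, not details from the patent.

```python
WORD_BITS = 64
TAG_BITS = 3                         # the abstract's example: native + 7 others
TAG_SHIFT = WORD_BITS - TAG_BITS     # assumption: tag sits in the top bits

# Hypothetical identifier assignment; the patent text does not list values.
NATIVE_VLIW = 0
ISA_IDS = {"native_vliw": NATIVE_VLIW, "sparc": 1, "powerpc": 2, "x86": 3}


def tag_instruction(insn32, isa_id):
    """Extend a 32-bit instruction word to 64 bits and stamp the ISA tag."""
    if not 0 <= insn32 < 2 ** 32:
        raise ValueError("instruction must fit in 32 bits")
    if not 0 <= isa_id < 2 ** TAG_BITS:
        raise ValueError("ISA id does not fit in the tag field")
    return (isa_id << TAG_SHIFT) | insn32


def route(word64):
    """Inspect the tag: native words go to dispatch, others to a decode unit."""
    isa_id = word64 >> TAG_SHIFT
    if isa_id == NATIVE_VLIW:
        return ("dispatch", word64)
    return ("dynamic_decode", isa_id, word64 & 0xFFFFFFFF)


if __name__ == "__main__":
    word = tag_instruction(0x12345678, ISA_IDS["powerpc"])  # arbitrary payload
    print(hex(word), route(word))
```

With three tag bits there are eight distinct values, which matches the abstract's count of one native ISA plus seven non-native ISAs.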

Proceedings ArticleDOI
01 Jan 1995
TL;DR: The PowerPC logic verification methodology is a general purpose approach suitable for a large class of chip designs that can exceed five million transistors in size and has been demonstrated by realizing three PowerPC microprocessor chips that were functional the first time.
Abstract: The PowerPC logic verification methodology is a general purpose approach suitable for a large class of chip designs that can exceed five million transistors in size. Several validation techniques are integrated into an automated logic verification strategy. The success of this methodology has been demonstrated by realizing three PowerPC microprocessor chips that were functional the first time.

Proceedings ArticleDOI
27 Jun 1995
TL;DR: A new approach to microarchitecture validation that adopts a paradigm analogous to that of automatic test pattern generation (ATPG) for digital logic testing is presented; its explicitly generated sequences can achieve higher coverage in fewer cycles than ad hoc approaches.
Abstract: The paper presents a new approach to microarchitecture validation that adopts a paradigm analogous to that of automatic test pattern generation (ATPG) for digital logic testing. In this approach, the microarchitecture is rigorously specified in a set of machine description files. Based on these files, all possible pipeline hazards can be systematically identified. Using this hazard list (analogous to a fault list for ATPG), specific sequences of instructions (analogous to test patterns) are automatically generated and constitute the test program. The execution of this test program validates the correct detection and resolution of all interinstruction dependences by the microarchitecture's pipeline interlock mechanism. Actual software tools have been developed for the automatic construction of the hazard list and the automatic generation of the test sequences. These explicitly generated sequences can achieve higher coverage in fewer cycles than ad hoc approaches. 100% coverage of the hazard list can be ensured. These tools have been applied to four contemporary superscalar processors, namely the Alpha AXP 21064 and 21164 microprocessors, and the PowerPC 601 and 620 microprocessors.
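
The idea of deriving a hazard list from an instruction description can be illustrated with the three classic register dependences (RAW, WAR, WAW). This is a deliberately small sketch, not the paper's tool: the `Insn` record, the two example instructions, and the function names are hypothetical, and real machine description files would also cover pipeline stages and structural hazards.

```python
from typing import NamedTuple


class Insn(NamedTuple):
    op: str
    dest: frozenset   # registers written
    srcs: frozenset   # registers read


def hazards(first, second):
    """Classify register dependences when `second` follows `first`."""
    found = set()
    if first.dest & second.srcs:
        found.add("RAW")   # second reads what first writes
    if first.srcs & second.dest:
        found.add("WAR")   # second overwrites what first reads
    if first.dest & second.dest:
        found.add("WAW")   # both write the same register
    return found


def build_hazard_list(insns):
    """Enumerate every ordered pair and the hazards it exercises."""
    return [
        (a.op, b.op, kind)
        for a in insns
        for b in insns
        if a is not b
        for kind in sorted(hazards(a, b))
    ]


if __name__ == "__main__":
    add = Insn("add", frozenset({"r3"}), frozenset({"r1", "r2"}))
    lwz = Insn("lwz", frozenset({"r4"}), frozenset({"r3"}))
    for entry in build_hazard_list([add, lwz]):
        print(entry)
```

Each tuple in the list names an instruction pair that a test sequence would need to contain, which mirrors the fault-list analogy drawn in the abstract.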

Proceedings ArticleDOI
Jen-Tien Yen1, M. Sullivan1, C. Montemayor, P. Wilson, R. Evers 
21 Oct 1995
TL;DR: A comprehensive approach which includes both creation of the simulation environment and generation of the test cases is presented for multiprocessor verification of PowerPC 620 at the functional model level.
Abstract: A comprehensive approach which includes both creation of the simulation environment and generation of the test cases is presented for multiprocessor verification of PowerPC 620 at the functional model level.

Proceedings ArticleDOI
02 Oct 1995
TL;DR: A novel hardware-driven data prefetching scheme, called the Instruction Opcode-Based Prefetching (IOBP), is proposed and the simulation shows that this IOBP scheme is very effective in reducing processor stall time due to memory accesses, especially for array or pointer references with constant strides.
Abstract: In the latest processor architectures such as IBM PowerPC and HP Precision Architecture (PA), it is found that certain important compound opcodes such as LOAD-UPDATE and LOAD-MODIFY contain accurate information about how data will be referenced in the near future. Furthermore, these opcodes have been fully utilized by the compiler in the program code generation. With the migration of data cache onto the processor chip, it is now possible for the on-chip cache controller to perform intelligent data prefetching based on the information from the instruction decode unit. In this paper, a novel hardware-driven data prefetching scheme, called the Instruction Opcode-Based Prefetching (IOBP), is proposed. Our simulation shows that this IOBP scheme is very effective in reducing processor stall time due to memory accesses, especially for array or pointer references with constant strides.
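
To see why a load-with-update opcode carries stride information, note that it writes the effective address back to the base register, so the next execution of the same instruction will touch that address plus the same displacement. The toy model below is a sketch under strong assumptions (an unbounded cache, prefetches that complete instantly, 32-byte lines); it is not the IOBP design from the paper, and `load_with_update` and `count_misses` are hypothetical names.

```python
LINE_BYTES = 32   # assumed cache line size for this toy model


def load_with_update(regs, base, disp):
    """Model a load-with-update: EA = base + disp, and EA is written back.
    Returns the effective address and the address the next iteration
    would touch if the displacement stays constant."""
    ea = regs[base] + disp
    regs[base] = ea
    return ea, ea + disp


def count_misses(start, disp, iterations, prefetch):
    """Count cache-line misses for a constant-stride loop."""
    cache = set()
    regs = {"rA": start}
    misses = 0
    for _ in range(iterations):
        ea, predicted = load_with_update(regs, "rA", disp)
        if ea // LINE_BYTES not in cache:
            misses += 1
            cache.add(ea // LINE_BYTES)
        if prefetch:
            cache.add(predicted // LINE_BYTES)   # idealized prefetch
    return misses


if __name__ == "__main__":
    print("no prefetch:", count_misses(0, 32, 8, prefetch=False))
    print("prefetch:   ", count_misses(0, 32, 8, prefetch=True))
```

In this idealized setting only the first access misses when prefetching is on, which is the constant-stride case the abstract highlights.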

Proceedings ArticleDOI
02 Oct 1995
TL;DR: The PowerPC 604 uP provides a wealth of very advanced features for analyzing system hardware, software, and symmetric multiprocessor systems and these capabilities are becoming indispensable as more function is moved from the system boards to the microprocessors.
Abstract: Performance monitors (PM) have been traditionally viewed as hardware luxuries only available to large/multichip processors. This perception is quickly changing thanks to the incorporation of monitoring instrumentation in most of the current high-volume microprocessors used in PCs and workstations. The PowerPC 604 uP has raised the standard of excellence in this area. It provides a wealth of very advanced features for analyzing system hardware, software, and symmetric multiprocessor systems. These capabilities are becoming indispensable as more function is moved from the system boards to the microprocessors. Furthermore, the PowerPC 604 is enhancing the effort of porting software between various architectures. Software vendors to system architects are currently taking advantage of these PowerPC 604 performance monitor capabilities with great success. Some of these companies include IBM, Apple, Motorola, Groupe Bull, and Microsoft among others.

Proceedings ArticleDOI
15 Feb 1995
TL;DR: This superscalar microprocessor is the first 64 b implementation of the PowerPC architecture and delivers balanced performance suitable for high-end workstations and servers.
Abstract: This superscalar microprocessor is the first 64 b implementation of the PowerPC architecture. With estimated performance levels of 225 SPECint92 and 300 SPECfp92 at a nominal processor frequency of 133 MHz and a 4M L2 operating at 67 MHz, this processor delivers balanced performance suitable for high-end workstations and servers. The chip is realized in n-well 0.5 μm CMOS with p-epi on a p⁺ substrate. There are four layers of metallization. The processor contains 6.88M transistors and dissipates an estimated 30 W at 133 MHz from a 3.3 V power supply. The 18.2 × 17.1 mm² die is packaged in a 25 × 25 ball grid array.

Proceedings ArticleDOI
G.Z.N. Cai1
28 Mar 1995
TL;DR: The architecture and multiprocessor verification for the PowerPC 604 data cache systematically checks the data cache architecture, logic, and implementation correctness and provides the assurance that the PowerPC 604 microprocessor's aggressive hardware and software implementation is carried out correctly in the uniprocessor and multiprocessor environment.
Abstract: The PowerPC 604 microprocessor is a high-performance 32-bit implementation, which is optimized to produce compact code while adhering to RISC philosophy. The PowerPC 604 microprocessor can sustain a maximum issue rate of 4 instructions per cycle. The data cache of the 604 is a 16 KB four-way set-associative non-blocking cache which contains MESI states (M: Modified, E: Exclusive-unmodified, S: Shared, I: Invalid), a reservation bit with its reservation address register, an independent snoop port, WIMG (W: cache write policy, I: cacheability, M: coherency mode, G: protection against speculative access) support logic, and parity bits. The 604 has an on-chip phase-locked loop to provide different Processor/Bus clock ratios to simplify the system design while using a 100 MHz processor clock. The data cache to BIU (Bus Interface Unit) interface can handle different Processor/Bus clock ratios. The architecture and multiprocessor verification for the PowerPC 604 data cache systematically checks the data cache architecture, logic, and implementation correctness and provides the assurance that the PowerPC 604 microprocessor's aggressive hardware and software implementation is carried out correctly in the uniprocessor and multiprocessor environment.

Proceedings ArticleDOI
C. Montemayor1, M. Sullivan, Jen-Tien Yen, P. Wilson, R. Evers 
02 Oct 1995
TL;DR: In creating SCPG, the design complexity and frequent design changes were dealt with by abstracting areas of concern as simple languages, writing tools to generate tests, and executing these in the standard verification environment.
Abstract: Multiprocessor design verification for the PowerPC 620 microprocessor was challenging due to the 620 bus protocol complexity, the highly concurrent bus and level 2 (L2) cache interfaces, and the extensive system configurability. In order to verify this functionality, a combination of random and deterministic approaches was used. The Random Test Program Generator (RTPG) and the newly developed Stochastic Concurrent Program Generator (SCPG) tools were used for random verification. On the deterministic front, testcases in C were written to verify specific scenarios. In creating SCPG, we dealt with the design complexity and frequent design changes by abstracting areas of concern as simple languages, writing tools to generate tests, and executing these in the standard verification environment. The added value of these tests is that they exercise true data sharing among processors, are self-checking and resemble commercial multiprocessor code.

Proceedings ArticleDOI
05 Mar 1995
TL;DR: Support for the operation of PowerPC symmetric multiprocessing systems was introduced with Version 4 of the AIX operating system and its evolution from the uniprocessor Version 3 implementation is described.
Abstract: Support for the operation of PowerPC symmetric multiprocessing systems was introduced with Version 4 of the AIX operating system. This paper describes its evolution from the uniprocessor Version 3 implementation. It also discusses the kernel changes to support threads which allow applications to exploit the inherent parallelism of SMP.

Proceedings Article
16 Jan 1995
TL;DR: This paper describes those MP features that Bull and IBM together introduced into the AIX operating system to support the Symmetric Multiprocessor machine marketed by Bull under the Escala name and by IBM under the RS/6000 Models G30, J30 and R30 names.
Abstract: This paper describes those MP features that Bull and IBM together introduced into the AIX operating system to support the Symmetric Multiprocessor machine marketed by Bull under the Escala name and by IBM under the RS/6000 Models G30, J30 and R30 names. The PowerPC architecture and the AIX operating system present some specific challenges. We present the major problems encountered and how they were solved.

Proceedings ArticleDOI
C. Pyron1, W.C. Bruce
21 Oct 1995
TL;DR: Boundary scan problems and solutions that arose during implementation of Standard features and additional private instructions are discussed and selected IEEE Standard 1149.1 implementation issues are described.
Abstract: Selected IEEE Standard 1149.1 implementation issues are described for four PowerPC family devices. Boundary scan problems and solutions that arose during implementation of Standard features and additional private instructions are discussed.

Proceedings ArticleDOI
02 Oct 1995
TL;DR: The PowerPC 620 microprocessor introduces a new integrated secondary cache controller and system bus interface that support the snoop-based MESI cache coherency protocol and direct cache-to-cache data transfers.
Abstract: The PowerPC 620 microprocessor introduces a new integrated secondary cache controller and system bus interface. The secondary cache interface is 128 bits wide, supports L2 sizes from 1 MB to 128 MB, is ECC protected, can transfer 2.0 GB/sec at 133 MHz and supports an optional co-processor mode. The 620 bus is optimized for server-class systems requiring significant multiprocessing capability and supports the 64-bit PowerPC architecture with a 40-bit physical address bus and a separate 128-bit data bus. Address transfer rates of up to 33 M Addresses/sec at 66 MHz are achieved by pipelining the address snoop response with the address bus. The address and data buses are explicitly tagged allowing data transfers to be reordered with respect to the addresses. The data bus can transfer up to 1.0 GB/sec at 66 MHz. The bus protocol and the integrated L2 controller presented support the snoop-based MESI cache coherency protocol and direct cache-to-cache data transfers.
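
The bandwidth figures in the abstract above follow from bus width times clock rate. The sketch below redoes that arithmetic, assuming one full-width transfer per clock cycle (an assumption, since the abstract does not state the transfer protocol); `peak_bytes_per_second` is a hypothetical helper name.

```python
def peak_bytes_per_second(width_bits, clock_hz):
    """Peak transfer rate, assuming one full-width transfer per clock."""
    return (width_bits // 8) * clock_hz


if __name__ == "__main__":
    l2 = peak_bytes_per_second(128, 133_000_000)    # L2 interface
    data = peak_bytes_per_second(128, 66_000_000)   # system data bus
    print(f"L2 interface: {l2 / 1e9:.3f} x 10^9 bytes/s")
    print(f"Data bus:     {data / 1e9:.3f} x 10^9 bytes/s")
    # 33 M addresses/s on a 66 MHz bus is one address every two bus cycles.
    print("Bus cycles per address:", 66_000_000 // 33_000_000)
```

Sixteen bytes per cycle at 133 MHz gives about 2.1 × 10⁹ bytes/s and at 66 MHz about 1.06 × 10⁹ bytes/s, in line with the quoted 2.0 GB/sec and 1.0 GB/sec.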

Patent
15 Sep 1995
TL;DR: In this paper, a PowerPC-based NOSLM is concatenated onto an Intel-based NOSLM and offsets are adjusted to account for the size of the Intel-based NOSLM.
Abstract: A PowerPC based Network Operating System Loadable Module (NOSLM) is concatenated onto an Intel-based NOSLM and offsets are adjusted to account for the size of the Intel-based NOSLM. The resulting enlarged NOSLM appears as a typical Intel-based NOSLM to Intel-based servers. When the enlarged NOSLM is loaded by PowerPC-based servers, the offsets are used to point the server to the beginning of the PowerPC-based NOSLM code and the Intel-based NOSLM is interpreted as a machine-specific header.
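
The concatenate-and-adjust step in the abstract above can be sketched with plain byte strings. This is an illustration only: the real module format is not given in the abstract, so the offset table, its keys, and the function name `concatenate_modules` are hypothetical.

```python
def concatenate_modules(intel_module, ppc_module, ppc_offsets):
    """Append the PowerPC module to the Intel module and shift every
    PowerPC offset by the size of the Intel module that now precedes it."""
    shift = len(intel_module)
    combined = intel_module + ppc_module
    adjusted = {name: offset + shift for name, offset in ppc_offsets.items()}
    return combined, adjusted


if __name__ == "__main__":
    intel = b"INTEL-CODE"            # stand-in bytes, not a real module
    ppc = b"PPC-CODE!"
    combined, offsets = concatenate_modules(
        intel, ppc, {"code_start": 0, "entry": 4}
    )
    # An Intel-based loader would read from the start and see its own module;
    # a PowerPC-based loader would follow the adjusted offsets and treat the
    # leading Intel bytes as a header to skip.
    print(offsets)
    print(combined[offsets["code_start"]:])
```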

Journal ArticleDOI
TL;DR: From the 1956 IBM 7030 (Stretch) to today's PowerPC, this work presents queue configurations and prefetch strategies along with the design decisions that led to their final architectures.
Abstract: For several decades, designers have used queues to resolve two processor-memory interface problems: long latency and low bandwidth. Here, we discuss the evolution of instruction and branch target queues. We also explore their use to support variable-length instructions and reduce misalignment problems. From the 1956 IBM 7030 (Stretch) to today's PowerPC, we present queue configurations and prefetch strategies along with the design decisions that led to their final architectures.

Proceedings ArticleDOI
21 May 1995
TL;DR: In this article, transient energy management strategies are introduced by examining chip-on-substrate geometry and evaluating a transient thermal management case study on a PowerPC-based model notebook computer.
Abstract: Transient energy management strategies are introduced by examining chip-on-substrate geometry and evaluating a transient thermal management case study on a PowerPC-based model notebook computer. Cascaded frequency reduction, periodic heating and workload shifting techniques for dynamically controlling chip junction temperature are discussed. The model notebook computer case study indicates that it is possible to improve notebook computer performance dramatically by using a high-end processor and transient thermal storage cooling techniques.

Book
01 Apr 1995
TL;DR: The author provides a complete description of the specification for both the 32- and 64-bit implementations, including supervisor privilege level facilities; logical memory addresses; I/O and memory-mapped I/O; address translation for segments, pages, and blocks; virtual paging; and interrupts.
Abstract: From the Publisher: PowerPC System Architecture describes the hardware architecture of PowerPC systems, providing a clear, concise explanation of the PowerPC specification, the template upon which all PowerPC processors are designed. The author provides a complete description of the specification for both the 32- and 64-bit implementations, including supervisor privilege level facilities; logical memory addresses; I/O and memory-mapped I/O; address translation for segments, pages, and blocks; virtual paging; and interrupts. The second half of the book examines the PowerPC 601 processor as an example, exploring how the processor fits, adheres to, and deviates from the PowerPC processor specification. In addition, a detailed discussion of the bus structure and transaction protocol used by the 60x processors is provided. If you design or test hardware or software that involves PowerPC systems, PowerPC System Architecture is an essential, time-saving tool.

Proceedings ArticleDOI
05 Mar 1995
TL;DR: This paper compares and contrasts the 32-bit subset specification against the full 64-bit PowerPC Architecture specification, and uses the PowerPC 620 microprocessor implementation as a vehicle when examining the 64-bit features.
Abstract: This paper details the 64-bit PowerPC Architecture specification. It compares and contrasts the 32-bit subset specification against the full 64-bit specification. Architecture, application OS, and hardware implications of the 64-bit specifications are all explored in detail. In addition, 32- and 64-bit compatibility and OS migration strategies are described. The PowerPC 620 microprocessor implementation is used as a vehicle when examining the 64-bit features. The 620's MMU is described, and potential performance implications are discussed.

Journal Article
TL;DR: A 3.3 V Phase-Locked-Loop clock synthesizer implemented in 0.5 μm CMOS technology is described, which supports internal to external clock frequency ratios of 1, 1.5, 2, 3, and 4 as well as numerous static power down modes for PowerPC microprocessors.
Abstract: A 3.3 V Phase-Locked-Loop (PLL) clock synthesizer implemented in 0.5 μm CMOS technology is described. The PLL supports internal to external clock frequency ratios of 1, 1.5, 2, 3, and 4 as well as numerous static power down modes for PowerPC microprocessors. The CPU clock lock range spans from 6 to 175 MHz. Lock times below 15 μs, PLL power dissipation below 10 mW as well as phase error and jitter below ±100 ps have been measured. The total area of the PLL is 0.52 mm².