Showing papers in "IEEE Micro in 1993"


Journal Article•DOI•
TL;DR: The authors consider whether SPECmarks, the figures of merit obtained from running the SPEC benchmarks under certain specified conditions, accurately indicate the performance to be expected from real, live workloads, and it is found that instruction cache miss ratios in general, and data cache miss ratios for the integer benchmarks, are quite low.
Abstract: The authors consider whether SPECmarks, the figures of merit obtained from running the SPEC benchmarks under certain specified conditions, accurately indicate the performance to be expected from real, live workloads. Miss ratios for the entire set of SPEC92 benchmarks are measured. It is found that instruction cache miss ratios in general, and data cache miss ratios for the integer benchmarks, are quite low. Data cache miss ratios for the floating-point benchmarks are more in line with published measurements for real workloads.

184 citations


Journal Article•DOI•
TL;DR: The design of the Sparcle chip, which incorporates mechanisms required for massively parallel systems in a Sparc RISC core, is described and its fine-grain computation, memory latency tolerance, and efficient message interface are discussed.
Abstract: The design of the Sparcle chip, which incorporates mechanisms required for massively parallel systems in a Sparc RISC core, is described. Coupled with a communications and memory management chip (CMMU), Sparcle allows a fast, 14-cycle context switch, an 8-cycle user-level message send, and fine-grain full/empty-bit synchronization. Sparcle's fine-grain computation, memory latency tolerance, and efficient message interface are discussed. The implementation of Sparcle as a CPU for the Alewife machine is described.

170 citations


Journal Article•DOI•
T. Asprey1, G.S. Averill1, E. DeLano1, R. Mason1, B. Weiner1, J. Yetter1 •
TL;DR: The PA7100 CPU, the first precision-architecture, reduced-instruction-set-computer (PA-RISC) architecture implementation to combine an integer core and floating-point coprocessor into a single-chip format, is described.
Abstract: The PA7100 CPU, the first precision-architecture, reduced-instruction-set-computer (PA-RISC) architecture implementation to combine an integer core and floating-point coprocessor into a single-chip format, is described. It incorporates superscalar execution and supports clock rates of up to 100 MHz in standard 0.8-μm CMOS. Features such as a flexible primary cache organization and multiprocessing capability allow the device to be scaled to a variety of system applications, price ranges, and performance levels. The microprocessor instruction execution pipeline, cache design, translation look-aside buffer (TLB) for virtual address translation, floating-point unit, and system interface bus are discussed. The design, test, and verification methods used in the development of the PA7100 are reviewed.

113 citations


Journal Article•DOI•
D. Alpert1, D. Avnon1•
TL;DR: The techniques of pipelining, superscalar execution, and branch prediction used in the Pentium CPU, which integrates 3.1 million transistors in 0.8-μm BiCMOS technology, are described in this article.
Abstract: The techniques of pipelining, superscalar execution, and branch prediction used in the Pentium CPU, which integrates 3.1 million transistors in 0.8-μm BiCMOS technology, are described. The technology improvements associated with the three most recent microprocessor generations are outlined. The Pentium's compatibility, performance, organization, and development process are also described. The compiler technology developed with the Pentium microprocessor, which includes machine-independent optimizations common to current high-performance compilers, such as inlining, unrolling, and other loop transformations, is reviewed.

101 citations


Journal Article•DOI•
TL;DR: The Alpha AXP 64-b architecture, which forms the basis for a series of high-performance computer systems, is described and performance measurement results for a variety of commonly used benchmarks under both OpenVMS AXP V1 and DEC OSF/1 V1.2 are presented.
Abstract: The Alpha AXP 64-b architecture, which forms the basis for a series of high-performance computer systems, is described. The implementation of this architecture in the 21064 microprocessor is discussed. This 1.4-cm × 1.7-cm CMOS chip incorporates 1.68 million transistors using a 0.75-μm, three-metal process. Performance measurement results for a variety of commonly used benchmarks under both OpenVMS AXP V1 and DEC OSF/1 V1.2 are presented.

89 citations


Journal Article•DOI•
TL;DR: The PowerPC 601 microprocessor, the first of a family of processors based on the PowerPC architecture, is described, which contains a 32-Kb cache and a superscalar machine organization that allows dispatch and execution of up to three instructions each clock cycle.
Abstract: The PowerPC 601 microprocessor, the first of a family of processors based on the PowerPC architecture, is described. The general-purpose processor contains a 32-Kb cache and a superscalar machine organization that allows dispatch and execution of up to three instructions each clock cycle. The bus interface and storage control mechanisms can be configured for a wide range of system designs, from low-cost desktop personal computers to high-performance multi-processor systems. The PowerPC architecture, machine organization, chip packaging technology, and performance are discussed.

63 citations


Journal Article•DOI•
Kazuo Nakamura1, Narumi Sakashita1, Yasuhiko Nitta1, K. Shimomura1, T. Tokuda1 •
TL;DR: Fuzzy inference, a data processing method based on the fuzzy theory that has found wide use in the control field, is reviewed and a fuzzy inference data processor that operates at 200,000 fuzzy logic inferences per second is described.
Abstract: Fuzzy inference, a data processing method based on the fuzzy theory that has found wide use in the control field, is reviewed. Consumer electronics, which accounts for most current applications of this concept, does not require very high speeds. Although software running on a conventional microprocessor can perform these inferences, high-speed control applications require much greater speeds. A fuzzy inference data processor that operates at 200,000 fuzzy logic inferences per second and features 12-b input and 16-b output resolution is described.

62 citations


Journal Article•DOI•
TL;DR: It is shown that the implementation proves the functionality of the chip and indicates that a high dynamic range camera system is feasible.
Abstract: The performance and architecture of a high dynamic range camera (HDRC) chip and the conceptual advantages of its adaptation to image processing systems in traffic environments are discussed. The HDRC chip was developed with 64 × 64 pixels using a standard digital 1.2-μm CMOS technology. It is shown that the implementation proves the functionality of the chip and indicates that a high dynamic range camera system is feasible.

59 citations


Journal Article•DOI•
U. Palmquist1•
TL;DR: An onboard autonomous intelligent cruise control system which controls a vehicle's speed according to the driver's desire and the speed of and distance to the preceding vehicle is discussed.
Abstract: An onboard autonomous intelligent cruise control system which controls a vehicle's speed according to the driver's desire and the speed of and distance to the preceding vehicle is discussed. The system offers a one-directional short-range system for vehicle-vehicle and roadside-vehicle communication and considerations for recommended speed, limits, and traffic signals. It is a potential key element in linking and integrating the driver-vehicle-infrastructure in future intelligent transportation systems. Two field trials undertaken to determine the feasibility of the system are described.

56 citations


Journal Article•DOI•
TL;DR: The technical issues that the Advanced Vehicle Control Systems (AVCS) Committee of the Intelligent Vehicle Highway Society (IVHS) of America has identified as necessary to improve the performance of the surface transportation system are discussed.
Abstract: The technical issues that the Advanced Vehicle Control Systems (AVCS) Committee of the Intelligent Vehicle Highway Society (IVHS) of America has identified as necessary to improve the performance of the surface transportation system are discussed. AVCSs represent the application of sensors, computers, and electromechanical actuators to provide drivers with warnings of hazards, assistance in controlling their vehicles, or fully automated control of vehicle motions. The constraints under which AVCS enabling technologies must be brought to maturity, such as cost, reliability, fault tolerance, and environmental hardening, combined with the basic performance requirements, are outlined. The enabling technologies for AVCS, subdivided into categories of sensors, communication, computation, electromechanical actuators, software and systems technologies, and special tools and facilities, are discussed. AVCS target products are reviewed.

55 citations


Journal Article•DOI•
TL;DR: Applications considered are secure electronic mail, secure communications, directory authentication and network management, banking, and escrowed encryption.
Abstract: The author reviews encryption algorithms and standards, how they compare, how they differ, and where they are headed. Attention is given to secret-key cryptosystems, public-key cryptosystems, digital signature schemes, key-agreement algorithms, cryptographic hash functions, and authentication codes. Applications considered are secure electronic mail, secure communications, directory authentication and network management, banking, and escrowed encryption.

Journal Article•DOI•
M. Awaga1, H. Takahashi•
TL;DR: The architecture and design of the μVP, a single-chip vector coprocessor developed to meet the needs of high-performance processors, are described.
Abstract: The architecture and design of the μVP, a single-chip vector coprocessor developed to meet the needs of high-performance processors, are described. The μVP is a supercomputer component implemented on a single large-scale-integrated (LSI) CMOS chip. With 206-MFLOPS single-precision and 106-MFLOPS double-precision performance at 50 MHz, the μVP offers a rate almost equivalent to that of typical minisupercomputers.

Journal Article•DOI•
TL;DR: Simulations and practical tests confirm that a small-size feedforward autonomous neural network (21 neurons) can learn to steer a vehicle at high speeds only from looking at human-driving examples, including the nonlinear dynamics of the vehicle and the driver's individual driving style.
Abstract: A solution to autonomous lateral vehicle guidance using a neurocontroller that can learn from measured human-driving data without knowledge of the physical car parameters is discussed. Simulations and practical tests confirm that a small-size feedforward autonomous neural network (21 neurons) can learn to steer a vehicle at high speeds only from looking at human-driving examples. In this way, the network learns the total closed-loop behavior, including the nonlinear dynamics of the vehicle and the driver's individual driving style. The main result of practical investigations is that the neural controller trained on human-driving examples exhibits an aperiodic behavior that does not vanish at higher speeds (tests performed up to 130 km/h) and produces fewer lateral deviations than the linear state controller.

Journal Article•DOI•
TL;DR: The Tiny, CSN, Multiple Rings, Ordered Dimensions, and interval labeling routing systems for transputer networks are reviewed and compared with respect to several criteria, such as adaptivity, deadlock freedom, generality, livelock freedom, and network latency.
Abstract: The Tiny, CSN, Multiple Rings, Ordered Dimensions, and interval labeling routing systems for transputer networks are reviewed. The systems are compared with respect to several criteria, such as adaptivity, deadlock freedom, generality, livelock freedom, and network latency.

Journal Article•DOI•
TL;DR: The Gmicro/500, which features a RISC-like dual-pipeline structure for high-speed execution of basic instructions and represents a significant advance for the TRON architecture, is presented.
Abstract: The Gmicro/500, which features a RISC-like dual-pipeline structure for high-speed execution of basic instructions and represents a significant advance for the TRON architecture, is presented. Upwardly-object-compatible with earlier members of the Gmicro series, this microprocessor uses resident dedicated branch buffers to greatly enhance branch instruction execution speed. Its microprograms simultaneously use dual execution blocks to execute high-level language instructions effectively. Fabricated with a 0.6-μm CMOS technology on a 10.9-mm × 16-mm die, the chip operates at 50/66 MHz and achieves a processing rate of 100/132 MIPS.

Journal Article•DOI•
TL;DR: Four methods of providing precise interruptions are compared with regard to performance degradation and cost of implementation from the VLSI silicon resources perspective; the results provide valuable information for VLSI processor designers considering precise interruptions in their designs.
Abstract: Pipelining is an implementation technique that exploits parallelism among instructions. Imprecise interruption problems arise when a pipelined processor has multiple multicycle functional units because instruction completion might be out of order. An early issued, long-running instruction might generate an interruption after the completion of several short-running instructions issued later, resulting in an imprecise interruption. Four methods of providing precise interruptions with regard to performance degradation and cost of implementation are compared from the VLSI silicon resources perspective. Results provide valuable information for VLSI processor designers to consider if they include the precise interruption in their designs. The four methods are in-order instruction completion, reorder buffer, history file, and future file.
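The reorder-buffer method named in the abstract above can be illustrated with a minimal Python sketch. It is not taken from the paper: the `simulate` helper, the instruction names, and the latencies are all assumptions chosen for illustration. Instructions finish out of order, but results are committed strictly in program order, so a fault is reported only when the faulting instruction reaches the head of the buffer and no later instruction has committed.

```python
from collections import deque

def simulate(program):
    """program: list of (name, latency_cycles, raises_exception).

    Hypothetical model: one instruction issues per cycle, each finishes
    `latency` cycles after issue, and a reorder buffer commits results
    in program order. Returns (finish_order, commit_order, faulting_name).
    """
    # Finish cycle for each instruction: issue cycle (its index) + latency.
    finish = [(i + lat, i) for i, (_, lat, _) in enumerate(program)]
    finish_order = [program[i][0] for _, i in sorted(finish)]

    done_at = {i: cyc for cyc, i in finish}
    rob = deque(range(len(program)))  # buffer entries in program order
    commit_order, fault = [], None
    cycle = 0
    while rob:
        head = rob[0]
        # Only the oldest entry may commit, and only once it has finished.
        if done_at[head] <= cycle:
            name, _, raises = program[head]
            if raises:
                fault = name  # precise: no younger instruction has committed
                break
            commit_order.append(name)
            rob.popleft()
        else:
            cycle += 1
    return finish_order, commit_order, fault

# Assumed example: a long-running divide that faults, followed by two
# short instructions that finish earlier.
program = [
    ("fdiv", 6, True),
    ("add", 1, False),
    ("sub", 1, False),
]
finish_order, commit_order, fault = simulate(program)
```

In this assumed example `add` and `sub` finish before `fdiv`, yet neither has committed when the fault on `fdiv` is recognized, which is the property that makes the interruption precise.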

Journal Article•DOI•
TL;DR: The characteristics, advantages, and potential applications of amalgam systems, which offer a promising alternative to traditional electronics solders for lower temperature processes and components, without the environmental drawbacks of most solder systems, are described.
Abstract: The characteristics, advantages, and potential applications of amalgam systems, which offer a promising alternative to traditional electronics solders for lower temperature (and hence lower cost) processes and components, without the environmental drawbacks of most solder systems, are described. Amalgams are nonequilibrium, mechanically alloyed materials formed at or near room temperature between a liquid metal and a powder. They offer exceptional thermal stability and superior joint strength and thermal cycle measurements. Gallium/nickel, gallium/copper, and gallium/copper/nickel amalgam alloys are discussed. The amalgamation methods, requirements for electronics amalgams, amalgam hardening mechanism, and amalgam wetting techniques are also discussed. Applications of amalgams to large die area attachments and flip chips are described.

Journal Article•DOI•
TL;DR: Two processors that compete in the workstation/server markets are compared and the primary advantage for the Alpha design appears to be the high clock rate.
Abstract: Two processors that compete in the workstation/server markets are compared. The 62.5-MHz IBM RISC System/6000 Model 580 (RS1) exemplifies a moderate clock rate design. As the highest SPECmark89/MHz system, it can be viewed as maximizing the work performed per cycle. The 133-/200-MHz DEC Alpha processor represents an aggressive clock rate design. At 200 MHz, the Alpha has the highest MHz rate in the market. The authors discuss clock rate goals, how they influence design choices, and performance implications. The primary advantage for the Alpha design appears to be the high clock rate. The RS1 design includes a significant amount of hardware to increase superscalar capability, especially on floating-point codes. RS1 has a significant infinite cache CPI advantage on floating-point applications. Infinite cache CPI for the two designs seems comparable on fixed-point codes.

Journal Article•DOI•
TL;DR: The activities of the European Prometheus project's PRO-CHIP research groups, which have studied the problem of improving automotive electronic reliability while respecting the specifications of low-cost, high-volume production, light weight, compactness, and short time-to-market imposed by the automotive industry, are reviewed.
Abstract: The activities of the European Prometheus project's PRO-CHIP research groups, which have studied the problem of improving automotive electronic reliability while respecting the specifications of low-cost, high-volume production, light weight, compactness, and short time-to-market imposed by the automotive industry, are reviewed. The reliability problems most frequently encountered by electronic devices for automotive applications, including failure mechanisms due to the package or to the assembling technology, different kinds of electrical overstress, electrostatic discharge, electromagnetic interference, breakdown and burnout of power devices, and failure mechanisms accelerated by high temperatures and high current densities, and the procedures the manufacturers use to evaluate and improve the reliability of their products are discussed.

Journal Article•DOI•
TL;DR: It is interesting to look back today to reflect on what these entities set out to achieve, and to what degree they were successful, and in the course of doing so, the author offers some observations on how (and when) consortia of various types are best formed.
Abstract: It is pointed out that the role of de facto standard setting organizations is relevant and important to true de jure standard setting organizations such as ASC X3 and IEEE. On the one hand, as a practical matter, a de jure organization may opt not to develop a competing standard, if it is satisfied that the de facto organization is providing a useful standard. As a result, a de facto standard setting organization is effectively permitted to control an important area of technology. On the other hand, the endorsement or incorporation by de facto bodies of de jure standards augments the effectiveness of the latter standards. For these reasons, as well as the insights that these new consortia provide into the high stakes wars that have been fought in the high technology industry in recent years, it is interesting to look back today to reflect on what these entities set out to achieve, and to what degree they were successful. In the course of doing so, the author offers some observations on how (and when) consortia of various types are best formed and focuses on the structural and legal factors which should be addressed in structuring and operating a successful consortium.

Journal Article•DOI•
TL;DR: Programmable multichip modules (PMCMs), in which fabricated generic substrates are customized to meet the application-specific needs of a user, are discussed.
Abstract: Programmable multichip modules (PMCMs), in which fabricated generic substrates are customized to meet the application-specific needs of a user, are discussed. The design principles of PMCMs are reviewed. The methods for programming fully programmable MCMs and semiprogrammable MCMs are described.

Journal Article•DOI•
TL;DR: The trends in high density interconnection (HDI) multichip module (MCM) techniques that have the potential to reduce interconnection cost and production time are described and thin-film deposited dielectric (MCM-D) technology is discussed as a cost-effective method for future interconnection applications.
Abstract: The trends in high density interconnection (HDI) multichip module (MCM) techniques that have the potential to reduce interconnection cost and production time are described. The implementation in laminated dielectric (MCM-L) technology of a workstation processor core illustrates current substrate technology capabilities. The design, routing, layout and thermal management of the processor core are described. Thin-film deposited dielectric (MCM-D) technology is discussed as a cost-effective method for future interconnection applications.

Journal Article•DOI•
TL;DR: The trends of virtual-reality research in Japan are reviewed and a prototype of a plant monitoring system that uses camera images to provide a sense of being there, and experiments on the tactile senses of touch and pressure, are described.
Abstract: The trends of virtual-reality research in Japan are reviewed. Research on the application of virtual reality systems in three-dimensional imaging of software structures, remote control of construction robots, and molecular model design is discussed. A prototype of a plant monitoring system that uses camera images to provide a sense of being there, and experiments on the tactile senses of touch and pressure, are described. The space interface device for artificial reality (SPIDAR) is also discussed.

Journal Article•DOI•
TL;DR: Shifting register windows, a register windowing method that attempts to overcome some of the difficulties of traditional fixed- and variable-sized schemes, is described; it uses fewer register elements than a seven-window Sparc organization and has a very short register bus length.
Abstract: Shifting register windows, a register windowing method that attempts to overcome some of the difficulties of traditional fixed- and variable-sized schemes, is described. Using fewer register elements than a seven-window Sparc organization, shifting register windows more than halves spill/refill memory traffic and reduces visible spill/refill cycles by an order of magnitude. In addition, shifting register windows, a scheme based on fast hardware stack and register-memory dribbling, has a very short register bus length. It also zeros registers as they are being allocated, making common initialization unnecessary.

Journal Article•DOI•
TL;DR: The author presents a synthesis of thinking within the economics field about network development and standardization, focusing on understanding economic factors shaping important contemporary events and the development of standards in tomorrow's information infrastructure.
Abstract: The author presents a synthesis of thinking within the economics field about network development and standardization. The analysis focuses on understanding economic factors shaping important contemporary events and the development of standards in tomorrow's information infrastructure. The motivation for such a synthesis is that many observations about market mechanisms are not consistent with one another, nor do they all transparently synthesize into a single policy vision. The key to understanding this confusion is that standards take on a dual role, as a coordinator and as a constraint. These insights should be useful for the development of appropriate public policy and management strategy.

Journal Article•DOI•
TL;DR: In this paper, two processors that compete in the workstation/server markets are compared and it is shown that performance measurements on many systems support the initial claim that cycle time is not sufficient to determine performance.
Abstract: For part I, see ibid., vol.13, no.4, p.8-16 (1993). Two processors that compete in the workstation/server markets are compared. The 62.5-MHz IBM RISC System/6000 Model 580 exemplifies a moderate clock rate design. The 133-/200-MHz DEC Alpha processor represents an aggressive clock rate design. The performance implications of the memory subsystems and the effect of instruction sets on path length are described. It is shown that performance measurements on many systems support the initial claim that cycle time is not sufficient to determine performance.
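The claim that cycle time alone does not determine performance follows from the standard relation CPU time = path length × cycles per instruction / clock rate. The Python sketch below is only an illustration of that relation: the two clock rates come from the abstract above, while the instruction count and both CPI values are hypothetical numbers chosen to make the point, not measurements from the paper.

```python
def execution_time_s(instructions: float, cpi: float, clock_mhz: float) -> float:
    """CPU time in seconds = instruction count x CPI / clock rate."""
    return instructions * cpi / (clock_mhz * 1e6)

# Hypothetical path length and CPI values (assumptions, not paper data).
moderate_clock = execution_time_s(instructions=1e9, cpi=1.0, clock_mhz=62.5)
aggressive_clock = execution_time_s(instructions=1e9, cpi=3.2, clock_mhz=200.0)
```

With these assumed values both designs need about 16 s, even though one clock is 3.2 times faster than the other; any difference in path length between instruction sets would shift the balance further.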

Journal Article•DOI•
TL;DR: Modeling results show that definite relations exist between connector pitch, length, substrate thickness and material, and stresses in a connector body.
Abstract: The stresses and strains in substrate-connector systems in multichip modules (MCMs) are discussed. Simultaneous stresses were applied as structural loads in combined stress models. Modeling results show that definite relations exist between connector pitch, length, substrate thickness and material, and stresses in a connector body.

Journal Article•DOI•
TL;DR: Spearmints, a system of hardware components that can be easily interfaced to the nodes of an instrumented distributed system for monitoring or evaluation using event-triggered measurements, is described; each machine of the target system must have one sensor that collects relevant events and marks them with global time stamps.
Abstract: Spearmints, a system of hardware components that can be easily interfaced to the nodes of an instrumented distributed system for monitoring or evaluation using event-triggered measurements, is described. Each machine of the target system must have one sensor that collects relevant events and marks them with global time stamps. The sensors can be attached to a common measurement system that samples the marked events on- or offline, orders them chronologically, and analyzes the resulting sequence. The design of Spearmints is based on providing a simple and universal tool that causes little interference and furnishes highly accurate measurements in distributed systems. As Spearmints only requires standard interfaces for its integration into an object system and its connection to a measurement system, it permits the use of a wide range of measurement systems for the evaluation of a variety of distributed systems.
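The chronological ordering step described in the abstract above can be sketched in software as a merge of per-node event lists keyed on a global time stamp. This is only an illustration of the idea: Spearmints itself is hardware, and the `Event` layout, the `merge_event_streams` helper, and the sample events are assumptions.

```python
import heapq
from typing import Iterable, List, Tuple

# Assumed event layout: (global_timestamp, node, description).
Event = Tuple[int, str, str]

def merge_event_streams(streams: Iterable[List[Event]]) -> List[Event]:
    """Merge per-node event lists, each already sorted by time stamp,
    into a single chronologically ordered sequence."""
    return list(heapq.merge(*streams, key=lambda e: e[0]))

# Hypothetical events recorded by the sensors of two nodes.
node_a = [(10, "A", "send request"), (42, "A", "receive reply")]
node_b = [(17, "B", "receive request"), (35, "B", "send reply")]
trace = merge_event_streams([node_a, node_b])
```

Because every event carries a global time stamp, the merged trace interleaves the two nodes correctly without any further clock correction, which is what makes later analysis of the sequence straightforward.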

Journal Article•DOI•
TL;DR: Simulation and layouts show that the proposed Θ-search associative memory chip consisting of 256 words, each 64-b long, can fit on a 13.5-mm × 9.5-mm chip.
Abstract: The design of a high-capacity Θ-search associative memory (Θ ∈ {<, >, ≤, ≥, =, ≠}) is presented. PSPICE simulation and layouts show that the proposed Θ-search associative memory chip consisting of 256 words, each 64-b long, can fit on a 13.5-mm × 9.5-mm chip. It can perform maskable Θ-search operations over its contents in 110 ns.
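A maskable comparison search of the kind described in the abstract above can be modeled in a few lines of Python. This is a functional sketch only: the chip compares all stored words in parallel, whereas the code loops over them, and the operator table, the `theta_search` helper, and the sample contents are assumptions rather than details from the paper.

```python
import operator

# Comparison operators assumed for the search.
THETA = {
    "<": operator.lt, ">": operator.gt, "<=": operator.le,
    ">=": operator.ge, "==": operator.eq, "!=": operator.ne,
}

WORD_BITS = 64  # word length given in the abstract

def theta_search(words, key, theta, mask=(1 << WORD_BITS) - 1):
    """Return indices of stored words w with (w & mask) theta (key & mask)."""
    compare = THETA[theta]
    masked_key = key & mask
    return [i for i, w in enumerate(words) if compare(w & mask, masked_key)]

# Hypothetical memory contents.
memory = [0x10, 0x2F, 0x30, 0x1F]
hits_lt = theta_search(memory, 0x2F, "<")
hits_low_nibble = theta_search(memory, 0x0F, "==", mask=0x0F)
```

The mask selects which bit positions take part in the comparison, so the same stored words can be searched on different fields without rewriting them.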

Journal Article•DOI•
TL;DR: The evolution of the ATM (asynchronous transfer mode) from its inception in early trials through to the current developments is reviewed, with strong emphasis placed on the role of standards development in the development of the technology.
Abstract: The evolution of the ATM (asynchronous transfer mode) from its inception in early trials through to the current developments is reviewed. Strong emphasis is placed on the role of standards development in the development of the technology, as it is the opinion of the author that ATM could never have reached the status that it has today without the development effort put into the standards over the period between 1986 and 1992. The standards development process has also been a learning process. Technical differences seem to get resolved more quickly with more interaction at informal workshops. The creation of the ATM Forum has taken ATM one step closer to the end user. However, the standards community is aware of a need to improve its process, to develop standards that are less ambiguous and less subject to error, and to ensure interoperability of different networks and different vendor products. The author notes that over the next few years one will see major changes in these areas to improve the product of standards development.