Showing papers on "Multi-core processor published in 1991"


Patent
18 Oct 1991
TL;DR: A high-performance graphics applications controller is provided that has a core processor and a coprocessor to independently perform desired graphics functions; a direct memory access (DMA) controller cooperates with the coprocessor to generate source and destination addresses and employs a unique set of commands to speed operation.
Abstract: A high-performance graphics applications controller having a core processor and a coprocessor to independently perform desired graphics functions is provided. The core processor and the coprocessor divide processing tasks to speed execution and to reduce the burden on the host CPU. A direct memory access (DMA) controller cooperates with the coprocessor to generate source and destination addresses and employs a unique set of commands to speed operation. The core processor employs a local CPU and data and address caches to locally perform desired graphics operations independently of, but in conjunction with, the coprocessor. The present invention has particular application with smart terminals and wherever pixel-oriented data is required.

105 citations


Proceedings ArticleDOI
14 Oct 1991
TL;DR: A method is presented for improving the performance of many computationally intensive tasks by extracting information at compile-time to synthesize new operations that augment the functionality of a core processor.
Abstract: Substantial gains can be achieved by allowing the configuration and fundamental operations of a processor to adapt to a user's program. A method is presented for improving the performance of many computationally intensive tasks by extracting information at compile-time to synthesize new operations that augment the functionality of a core processor. The newly synthesized operations are targeted to RAM-based reconfigurable logic located within the processor. A proof-of-concept system called PLADO, consisting of a C configuration compiler and a hardware platform, is presented. Computation and performance results confirm the concept's viability and demonstrate significant speed-up.

53 citations


Journal ArticleDOI
M. Atkins
TL;DR: The internal design of the i860 CPU, which exploits pipelining and parallelism more than previous microprocessors, is described; other innovations include simultaneous floating-point operations similar to digital signal processing, a two-instruction-per-clock mode, fast floating-point pipelines, graphics instructions, and high-bandwidth on-chip registers and caches.
Abstract: The internal design of the i860 CPU, which exploits pipelining and parallelism more than previous microprocessors, is described. The i860 uses RISC concepts and memory-performance optimizations in several novel ways. Other innovations include simultaneous floating-point operations similar to digital signal processing, a two-instruction-per-clock mode, fast floating-point pipelines, graphics instructions, and high-bandwidth on-chip registers and caches. These features make it one of the fastest single-chip processors available.

43 citations


Book ChapterDOI
01 Jan 1991
TL;DR: This chapter discusses the next generation of the complex instruction set computer (CISC) processor family, the MC68040, which incorporates separate integer and floating-point units, giving sustained performances of 20 integer MIPS and 3.5 double precision Linpack MFLOPS.
Abstract: This chapter discusses the next generation of the complex instruction set computer (CISC) processor family. The MC68040 is the latest member. Fabricated in 0.8 µm HCMOS technology and using 1.2 million transistor sites, it incorporates separate integer and floating-point units, giving sustained performances of 20 integer MIPS and 3.5 double-precision Linpack MFLOPS, dual 4 Kbyte instruction and data caches, dual memory management units, and an extremely sophisticated bus interface unit. Revolutionary rather than evolutionary, the design takes the ideas of overlapping instruction execution and pipelining to a new level for CISC processors. The instruction execution unit consists of a six-stage pipeline that sequentially fetches an instruction, decodes it, calculates the effective address, fetches the address operands, executes the instruction, and finally writes back the results. The MC68040 has two memory management units, one allocated for data and the other for instructions. This family of integrated MC68000 processors bridges the gap between 8-bit microcontrollers and the 16/32-bit M68000 families. The idea was simple: take an MC68000 family processor core, add all the interface logic needed for chip select, some on-chip memory, and some intelligent peripherals controlled by another dedicated processor. This type of configuration is almost identical to that of an MC68HC11 MCU except that processing power is greatly increased. Two members of the family are available, with many more to come.
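The six-stage pipeline described in this abstract can be illustrated with a small timing sketch. It assumes an idealized, stall-free pipeline in which one instruction enters per cycle; this is a textbook simplification and not the MC68040's documented timing. The stage labels paraphrase the abstract, and the function names `ideal_cycles` and `stage_at` are hypothetical.

```python
# Idealized model of a six-stage pipeline (no stalls, one instruction issued per cycle).
# Stage names paraphrase the abstract; real MC68040 timing is not modeled here.
STAGES = ["fetch", "decode", "ea_calc", "operand_fetch", "execute", "write_back"]


def ideal_cycles(n_instructions, n_stages=len(STAGES)):
    """Cycles needed to finish n instructions when the pipeline never stalls."""
    if n_instructions == 0:
        return 0
    # The first instruction needs n_stages cycles; each later one completes a cycle after.
    return n_stages + n_instructions - 1


def stage_at(instr_index, cycle):
    """Stage occupied by instruction `instr_index` at `cycle` (both 0-based), or None."""
    position = cycle - instr_index
    return STAGES[position] if 0 <= position < len(STAGES) else None


print(ideal_cycles(1))    # 6
print(ideal_cycles(10))   # 15
print(stage_at(2, 4))     # ea_calc
```

Under these assumptions, ten instructions take 15 cycles instead of the 60 an unpipelined design with the same stage latency would need, which is the overlap the abstract refers to.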

23 citations


Patent
20 Mar 1991
TL;DR: A scalable compound instruction set machine and method are described for processing a set of instructions or a program to be executed by a computer, determining statically which instructions may be combined into compound instructions that are executed in parallel by a scalar machine.
Abstract: A scalable compound instruction set machine and method are described which provide for processing a set of instructions or a program to be executed by a computer to determine statically which instructions may be combined into compound instructions that are executed in parallel by a scalar machine. Such processing looks for classes of instructions that can be executed in parallel without data-dependent or hardware-dependent interlocks. Without regard to their original sequence, the individual instructions are combined with one or more other individual instructions to form a compound instruction which eliminates interlocks. Control information is appended to identify information relevant to the execution of the compound instructions. The result is a stream of scalar instructions compounded or grouped together before instruction decode time so that they are already flagged and identified for selective simultaneous parallel execution by execution units. The compounding does not change the object code results, and existing programs realize performance improvements while maintaining compatibility with previously implemented systems for which the original set of instructions was provided.
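The static compounding step described above can be sketched as follows. This is a minimal illustration and not the patent's method: it only pairs adjacent instructions, checks register data dependences only, and ignores hardware-dependent interlocks and the appended control information. The `Instr` type, the register names, and the functions `independent` and `compound` are all hypothetical.

```python
from typing import List, NamedTuple, Tuple


class Instr(NamedTuple):
    """Hypothetical three-address instruction: one destination, some source registers."""
    op: str
    dest: str
    srcs: Tuple[str, ...]


def independent(a: Instr, b: Instr) -> bool:
    """True if b may execute alongside a: no register hazard between them."""
    return (a.dest not in b.srcs        # b does not read what a writes
            and a.dest != b.dest        # they do not write the same register
            and b.dest not in a.srcs)   # b does not overwrite what a reads


def compound(program: List[Instr]) -> List[Tuple[Instr, ...]]:
    """Greedily pair adjacent independent instructions; the rest stay single."""
    groups = []
    i = 0
    while i < len(program):
        if i + 1 < len(program) and independent(program[i], program[i + 1]):
            groups.append((program[i], program[i + 1]))
            i += 2
        else:
            groups.append((program[i],))
            i += 1
    return groups


program = [
    Instr("add", "r1", ("r2", "r3")),
    Instr("sub", "r4", ("r5", "r6")),
    Instr("mul", "r7", ("r1", "r4")),
    Instr("add", "r8", ("r7", "r2")),
]
for group in compound(program):
    print(" || ".join(i.op + " " + i.dest for i in group))
```

Here the first two instructions share no registers and are grouped, while the last two form a dependence chain through `r7` and are left as singles. Because the grouping is computed before execution, the original instruction semantics are untouched, which matches the abstract's claim that compounding does not change object code results.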

20 citations


Proceedings ArticleDOI
14 Oct 1991
TL;DR: The methods used in the verification of a MIPS-1 architecture-compatible embedded control processor are described and the transfer of fully functional rev A silicon to production demonstrated the success of this methodology.
Abstract: The methods used in the verification of a MIPS-1 architecture-compatible embedded control processor are described. This single-chip processor contains 700,000 transistors, operates at 50 MHz, and consists of a CPU core, 8 kB of instruction cache, 1 kB of data cache, a DRAM controller, a write buffer, three timers, and a bus interface unit (BIU). Individual module testing and integrated system testing were the two methods used for verification. Integrated system simulation included architectural, functional, and random instruction testing using behavioral simulation test environments. These techniques provided a comprehensive and effective testing environment. The transfer of fully functional rev A silicon to production demonstrated the success of this methodology.

10 citations


Proceedings ArticleDOI
14 Oct 1991
TL;DR: The development of an experimental high-performance microprocessor chip based on a 0.3-µm BiCMOS technology is discussed; the chip includes a four-way interleaved secondary cache accessed in parallel according to a split-bus protocol to reduce shared-memory conflicts.
Abstract: The development of an experimental high-performance microprocessor chip based on a 0.3-µm BiCMOS technology is discussed. It is designed to operate at a 250-MHz clock rate. It includes two processors, each of which executes two instructions in parallel. The chip performs 1000 MIPS when instructions and data are fetched from primary caches. It also includes a four-way interleaved secondary cache accessed in parallel according to a split-bus protocol, to reduce shared-memory conflicts. The VLSI architecture and design results of this chip are described.
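The 1000 MIPS figure follows directly from the clock rate, processor count, and issue width given in this abstract, as the short calculation below shows. The bank-selection function is an added illustration of how a four-way interleaved cache might spread consecutive lines across banks; the abstract does not state the mapping or the line size, so `LINE_BYTES` and `bank_of` are assumptions.

```python
# Peak throughput implied by the abstract's figures.
CLOCK_MHZ = 250     # clock rate stated in the abstract
PROCESSORS = 2      # processors on the chip
ISSUE_WIDTH = 2     # instructions each processor executes in parallel


def peak_mips(clock_mhz, processors, issue_width):
    """Peak MIPS, assuming every issue slot is filled on every cycle."""
    return clock_mhz * processors * issue_width


# Assumed values: the abstract gives only the interleave factor, not the line size.
LINE_BYTES = 16
BANKS = 4


def bank_of(address):
    """Bank holding `address` if consecutive cache lines rotate across the banks."""
    return (address // LINE_BYTES) % BANKS


print(peak_mips(CLOCK_MHZ, PROCESSORS, ISSUE_WIDTH))   # 1000
print([bank_of(a) for a in (0, 16, 32, 48, 64)])       # [0, 1, 2, 3, 0]
```

With a mapping of this kind, two processors touching neighbouring lines land in different banks and can be served at the same time, which is one way interleaving reduces conflicts.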

7 citations


Book ChapterDOI
01 Jan 1991
TL;DR: An analog integrated circuit cellular neural network implementation having digitally or continuously selectable template coefficients has been designed, providing a simple dual (analog and digital) computing structure.
Abstract: An analog integrated circuit cellular neural network implementation having digitally or continuously selectable template coefficients has been designed. Local logic and memory are added to each cell, providing a simple dual (analog and digital) computing structure. A variable-gain OTA is used as a voltage-controlled current source to implement programmability of the template elements. A 4-by-4 CNN core processor is realized using a 2 µm analog CMOS process. A separate digital control chip is designed to have a microcontroller, program and data memory, and to provide communication with the host computer. Together, the analog CNN core processor chip and the digital control chip also function as a stand-alone processing unit.

2 citations