Showing papers on "Field-programmable gate array" published in 1995


Proceedings ArticleDOI
15 Feb 1995
TL;DR: PathFinder is a router whose iterative algorithm converges to a solution in which all signals are routed while achieving close to the optimal performance allowed by the placement; routability is achieved by forcing signals to negotiate for a resource and thereby determine which signal needs the resource most.
Abstract: Routing FPGAs is a challenging problem because of the relative scarcity of routing resources, both wires and connection points. This can lead either to slow implementations caused by long wiring paths that avoid congestion or a failure to route all signals. This paper presents PathFinder, a router that balances the goals of performance and routability. PathFinder uses an iterative algorithm that converges to a solution in which all signals are routed while achieving close to the optimal performance allowed by the placement. Routability is achieved by forcing signals to negotiate for a resource and thereby determine which signal needs the resource most. Delay is minimized by allowing the more critical signals a greater say in this negotiation. Because PathFinder requires only a directed graph to describe the architecture of routing resources, it adapts readily to a wide variety of FPGA architectures such as Triptych, Xilinx 3000 and mesh-connected arrays of FPGAs. The results of routing ISCAS benchmarks on the Triptych FPGA architecture show an average increase of only 4.5% in critical path delay over the optimum delay for a placement. Routes of ISCAS benchmarks on the Xilinx 3000 architecture show a greater completion rate than commercial tools, as well as 11% faster implementations.

706 citations
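
As a rough illustration of the negotiation idea in the abstract above, the following Python sketch routes two-terminal nets over a directed graph and lets congested nodes grow more expensive until no node is shared. It is a simplified sketch, not PathFinder itself: the cost formula, the unit node capacity, the function and parameter names (`negotiate_routes`, `hist_gain`, `pres_gain`) and the omission of the delay/criticality term are all assumptions made for illustration.

```python
import heapq


def shortest_path(graph, cost, src, dst):
    """Dijkstra search where entering node v costs cost(v); returns the node list."""
    dist = {src: 0.0}
    prev = {}
    heap = [(0.0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == dst:
            break
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v in graph.get(u, ()):
            nd = d + cost(v)
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))
    if dst not in dist:
        raise ValueError(f"no path from {src} to {dst}")
    path = [dst]
    while path[-1] != src:
        path.append(prev[path[-1]])
    return path[::-1]


def negotiate_routes(graph, nets, max_iters=50, hist_gain=1.0, pres_gain=1.0):
    """Route every net in `nets` ({name: (src, dst)}) so that no node is shared.

    Assumed cost of a node: (1 + history) * (1 + pres_gain * iteration * users),
    where `users` counts other nets currently occupying the node.
    """
    history = {}    # accumulated penalty for nodes that were overused before
    occupancy = {}  # number of nets currently using each node
    routes = {}
    for iteration in range(1, max_iters + 1):
        for name, (src, dst) in nets.items():
            # Rip up this net's previous route before rerouting it.
            for node in routes.get(name, ()):
                occupancy[node] -= 1

            def cost(node):
                present = 1.0 + pres_gain * iteration * occupancy.get(node, 0)
                return (1.0 + history.get(node, 0.0)) * present

            routes[name] = shortest_path(graph, cost, src, dst)
            for node in routes[name]:
                occupancy[node] = occupancy.get(node, 0) + 1
        overused = [node for node, users in occupancy.items() if users > 1]
        if not overused:
            return routes
        for node in overused:
            history[node] = history.get(node, 0.0) + hist_gain
    raise RuntimeError("no congestion-free routing found within max_iters")
```

In this sketch a net with an alternative path eventually gives up a contested node to a net that has no other option, because the contested node's cost keeps rising for everyone while only the second net is forced to pay it.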


Patent
06 Feb 1995
TL;DR: A device-independent, frequency-driven layout system and method for field programmable gate arrays (FPGAs) lets a circuit designer specify the desired operating frequencies of clock signals in a design, so that the automatic layout system can generate, if possible, a physical FPGA layout that allows the targeted FPGA device to operate at the specified frequencies.
Abstract: A device independent, frequency driven layout system and method for field programmable gate arrays ("FPGA") which allow for a circuit designer to specify the desired operating frequencies of clock signals in a given design to the automatic layout system to generate, if possible, a physical FPGA layout which will allow the targeted FPGA device to operate at the specified frequencies. Actual net, path and skew requirements are automatically generated and fed to the place and route tools. The system and method of the present invention evaluates the frequency constraints, determines what delay ranges are acceptable for each electrical connection and targets those ranges throughout the layout.

211 citations
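
To make the step from a frequency constraint to per-connection delay ranges concrete, here is a minimal Python sketch. It is not the patented method: it assumes a single register-to-register path, ignores setup time and clock skew, and splits the remaining slack equally among the connections. The function name `connection_delay_budget` and its parameters are hypothetical.

```python
def connection_delay_budget(clock_mhz, logic_delays_ns, n_connections):
    """Return the routing delay (ns) each connection on a path may use.

    clock_mhz        target clock frequency for the path
    logic_delays_ns  fixed delays of the logic blocks on the path
    n_connections    number of routed connections on the path
    """
    if clock_mhz <= 0 or n_connections <= 0:
        raise ValueError("frequency and connection count must be positive")
    period_ns = 1000.0 / clock_mhz            # 1 MHz corresponds to 1000 ns
    slack_ns = period_ns - sum(logic_delays_ns)
    if slack_ns < 0:
        raise ValueError("logic delay alone exceeds the clock period")
    # Assumption: slack is shared equally; a real tool would weight connections.
    return slack_ns / n_connections
```

For a 50 MHz clock the period is 20 ns; with 10 ns of logic delay and four connections, each connection would be given 2.5 ns under these assumptions.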


Journal ArticleDOI
TL;DR: Automatic mapping tools for Triptych, an FPGA architecture with improved logic density and performance over commercial FPGAs, and extensions to these algorithms for mapping asynchronous circuits to Montage, the first FPGA architecture to completely support asynchronous and synchronous interface applications, are described.
Abstract: Field-programmable gate arrays (FPGAs) are becoming an increasingly important implementation medium for digital logic. One of the most important keys to using FPGAs effectively is a complete, automated software system for mapping onto the FPGA architecture. Unfortunately, many of the tools necessary require different techniques than traditional circuit implementation options, and these techniques are often developed specifically for only a single FPGA architecture. In this paper we describe automatic mapping tools for Triptych, an FPGA architecture with improved logic density and performance over commercial FPGAs. These tools include a simulated-annealing placement algorithm that handles the routability issues of fine-grained FPGAs, and an architecture-adaptive routing algorithm that can easily be retargeted to other FPGAs. We also describe extensions to these algorithms for mapping asynchronous circuits to Montage, the first FPGA architecture to completely support asynchronous and synchronous interface applications.

177 citations


Journal ArticleDOI
TL;DR: The Graphical Rapid Prototyping Environment (Grape-II) automates the prototyping methodology for general-purpose hardware systems by offering tools for resource estimation, partitioning, assignment, routing, scheduling, code generation, and parameter modification.
Abstract: We propose a rapid-prototyping setup to minimize development cost and a structured-prototyping methodology to reduce programming effort. The general-purpose hardware consists of commercial DSP processors, bond-out versions of core processors, and field-programmable gate arrays (FPGAs) linked to form a powerful, heterogeneous multiprocessor, such as the Paradigm RP developed within the Retides (Real-Time DSP Emulation System) Esprit project. Our Graphical Rapid Prototyping Environment (Grape-II) automates the prototyping methodology for these hardware systems by offering tools for resource estimation, partitioning, assignment, routing, scheduling, code generation, and parameter modification. Grape-II has been used successfully in three real-world DSP applications.

160 citations


Journal ArticleDOI
TL;DR: The authors explored the utility of custom computing machinery for accelerating the development, testing, and prototyping of a diverse set of image processing applications and developed a real time image processing system called VTSplash, based on the Splash-2 general-purpose platform.
Abstract: The authors explore the utility of custom computing machinery for accelerating the development, testing, and prototyping of a diverse set of image processing applications. We chose an experimental custom computing platform called Splash-2 to investigate this approach to prototyping real time image processing designs. Custom computing platforms are emerging as a class of computers that can provide near application specific computational performance. We developed a real time image processing system called VTSplash, based on the Splash-2 general-purpose platform. Splash-2 is an attached processor featuring programmable processing elements (PEs) and communication paths. The Splash-2 system uses arrays of RAM based field programmable gate arrays (FPGAs), crossbar networks, and distributed memory to accomplish the needed flexibility and performance tasks. Such platforms let designers customize specific operations for function and size, and data paths for individual applications.

156 citations


Patent
29 Sep 1995
TL;DR: A generalized data decompression engine incorporated within a field programmable gate array (FPGA) receives a compressed configuration bit stream and decompresses it; the resulting decompressed bit stream is then used to program logic cells within the FPGA.
Abstract: A generalized data decompression engine is incorporated within a field programmable gate array ("FPGA"). The generalized data decompression engine uses a general purpose data decompression technique such as, for example, a Lempel-Ziv type technique. During operation, a compressed configuration bit stream is received by the generalized data decompression engine in the FPGA and is decompressed thereby. A resultant decompressed configuration bit stream is then used to program logic cells within the FPGA.

146 citations
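
The abstract names a Lempel-Ziv type technique only as an example and does not give a token format. The Python sketch below shows what decompressing a configuration stream with a generic LZ77-style scheme could look like; the `(offset, length, literal)` token layout and the function name `lz77_decompress` are assumptions for illustration, not the patent's encoding.

```python
def lz77_decompress(tokens):
    """Expand LZ77-style tokens into the original byte string.

    Each token is (offset, length, literal):
      offset   how far back in the output the copy starts (0 if length is 0)
      length   number of bytes to copy from the already-decoded output
      literal  one byte value (0..255) appended after the copy, or None
    """
    out = bytearray()
    for offset, length, literal in tokens:
        if length and not 0 < offset <= len(out):
            raise ValueError("back-reference points outside decoded data")
        start = len(out) - offset
        for i in range(length):
            # Byte-by-byte copy so a reference may overlap the bytes it
            # is producing, which is how repeated patterns are encoded.
            out.append(out[start + i])
        if literal is not None:
            out.append(literal)
    return bytes(out)
```

Configuration data often contains long repeated patterns, which is the kind of redundancy a back-reference scheme like this exploits; whether a particular device's bit stream compresses well is not something the abstract states.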


Patent
06 Sep 1995
TL;DR: A configuration structure for a field programmable gate array (FPGA) allows a user to reconfigure or partly reconfigure the FPGA from within the FPGA, and allows an addressable configuration memory to be addressed in parallel through a set of address and data or through a serial interface.
Abstract: A configuration structure for a field programmable gate array (FPGA) allows a user to reconfigure or partly reconfigure the FPGA from within the FPGA and allows an addressable configuration memory to be addressed in parallel through a set of address and data or through a serial interface. Signals such as chip-enable and other control signals can be modified by user logic so that data loaded through a serial interface pin is entered into an addressed portion of configuration memory. The configuration memory programs not only the internal circuitry accessed by the user but also a programmable switch for directing signals between external pins, configuration memory control lines, and a serial data interface. Providing both parallel and serial interfaces allows a programmable switch which is initially configured to connect its related pad or pads to configuration control lines such as a chip enable line or a serial data input line to later be configured to connect an internally generated signal or signals to the line or lines and thus override any external signal which would have been connected to that line or lines.

136 citations


Proceedings ArticleDOI
19 Apr 1995
TL;DR: A configurable custom computing engine, based on field programmable gate arrays, to enable experiments on an interesting scale, using Teramac to conduct experiments with special purpose processors involving search of nontext databases.
Abstract: The Teramac configurable hardware system can execute synchronous logic designs of up to one million gates at rates up to 1 megahertz. A fully configured Teramac includes half a gigabyte of RAM and hardware support for large multiported register files. The system has been built from custom FPGAs packaged in large multichip modules (MCMs). A large custom circuit (~1,000,000 gates) may be compiled onto the hardware in approximately 2 hours, without user intervention. The system is being used to explore the potential of custom computing machinery (CCM).

126 citations


Book ChapterDOI
01 Sep 1995
TL;DR: Two basic implementation approaches with FPGAs, compile-time reconfiguration and run-time reconfiguration, are introduced, and existing applications for each strategy are discussed.
Abstract: Reconfigurable FPGAs provide designers with new implementation approaches for designing high-performance applications. This paper discusses two basic implementation approaches with FPGAs: compile-time reconfiguration and run-time reconfiguration. Compile-time reconfiguration is a static implementation strategy where each application consists of one configuration. Run-time reconfiguration is a dynamic implementation strategy where each application consists of multiple cooperating configurations. This paper introduces these strategies and discusses the implementation approaches for each strategy. Existing applications for each strategy are also discussed.

124 citations


Patent
27 Oct 1995
TL;DR: In this article, a user-programmable gate array architecture with a plurality of horizontal and vertical general interconnect channels, each including a plurality of interconnect conductors, some of which may be segmented, is presented.
Abstract: A user-programmable gate array architecture includes an array of logic function modules which may comprise one or more combinatorial and/or sequential logic circuits. An interconnect architecture comprising a plurality of horizontal and vertical general interconnect channels, each including a plurality of interconnect conductors some of which may be segmented, is imposed on the array. Individual ones of the interconnect conductors are connectable to each other and to the inputs and outputs of the logic function modules by user-programmable interconnect elements. A local interconnect architecture comprising local interconnect channels is also imposed on the array. Each local interconnect channel includes a plurality of local interconnect conductors and runs between pairs of adjacent ones of the logic function modules.

120 citations


Journal ArticleDOI
TL;DR: Triptych is presented, an FPGA architecture designed to achieve improved logic density with competitive performance by allowing a per-mapping tradeoff between logic and routing resources, and with a routing scheme designed to match the structure of typical circuits.
Abstract: Field-programmable gate arrays (FPGAs) are an important implementation medium for digital logic. Unfortunately, they currently suffer from poor silicon area utilization due to routing constraints. In this paper we present Triptych, an FPGA architecture designed to achieve improved logic density with competitive performance. This is done by allowing a per-mapping tradeoff between logic and routing resources, and with a routing scheme designed to match the structure of typical circuits. We show that, using manual placement, this architecture yields a logic density improvement of up to a factor of 3.5 over commercial FPGAs, with comparable performance. We also describe Montage, the first FPGA architecture to fully support asynchronous and synchronous interface circuits.

Proceedings ArticleDOI
19 Apr 1995
TL;DR: This paper presents a compiler capable of synthesizing the hardware configuration of FPGA execution units from C++ source code and shows that this approach leads to very short synthesis times compared with a VHDL synthesizer, for a similar quality of the generated hardware.
Abstract: If reconfigurable processors are to become widely used, we will need tools to help conventional programmers use them. In particular, a single high-level language should be used to program the whole application: both the part which will become the hardware configuration and the part which remains software. Spyder is a reconfigurable processor with configurable execution units. The C++ language has been chosen as the source language to program this processor. In this paper we present a compiler capable of synthesizing the hardware configuration of FPGA execution units from C++ source code. The same source code can be compiled by a standard C++ compiler for simulation purposes. First estimates show that this approach leads to very short synthesis times compared with a VHDL synthesizer, for a similar quality of the generated hardware.

Patent
26 May 1995
TL;DR: In this paper, the authors present an apparatus and method for decreasing the amount of time necessary to load configuration data into Field Programmable Gate Arrays (FPGAs) or other integrated circuit devices.
Abstract: An apparatus and method for decreasing the amount of time necessary to load configuration data into Field Programmable Gate Arrays (FPGAs) or other integrated circuit devices. In a preferred embodiment, serially arrayed FPGAs receive a concatenated stream of data from a common data bus. As a first FPGA reaches a loading-complete state, an enabling token is passed from the first FPGA to an enabling input on the next FPGA. The process repeats until all devices are completely loaded or fully configured.
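
The token-passing scheme in the abstract above can be modelled in a few lines of Python. This is only a behavioural sketch: it assumes each device's configuration length is known in advance (the list `sizes`), whereas a real device would detect its own loading-complete state; the function name `load_chain` is hypothetical.

```python
def load_chain(stream, sizes):
    """Distribute one concatenated configuration stream over a chain of devices.

    stream  bytes on the common data bus, device 0's data first
    sizes   positive configuration length (in bytes) of each device, in chain order
    Returns the list of per-device configurations.
    """
    configs = [bytearray() for _ in sizes]
    active = 0  # index of the device currently holding the enabling token
    for byte in stream:
        if active == len(sizes):
            raise ValueError("stream continues after the last device is loaded")
        configs[active].append(byte)
        if len(configs[active]) == sizes[active]:
            active += 1  # loading complete: pass the token to the next device
    if active != len(sizes):
        raise ValueError("stream ended before every device was loaded")
    return [bytes(c) for c in configs]
```

Every device sees the same bus, but only the token holder accepts data, so no per-device addressing is needed in this model.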

Proceedings ArticleDOI
Stephen M. Trimberger
01 Jan 1995
TL;DR: Routing models provided by some commercial FPGA architectures are described, and the effects of these architectures on routing algorithms are pointed out.
Abstract: Although many traditional Mask Programmed Gate Array (MPGA) algorithms can be applied to FPGA routing, FPGA architectures impose critical constraints and provide alternative views of the routing problem that allow innovative new algorithms to be applied. This paper describes routing models provided by some commercial FPGA architectures, and points out the effects of these architectures on routing algorithms. Implicit in the discussion is a commentary on current and future research in FPGA routing.

Proceedings ArticleDOI
19 Sep 1995
TL;DR: A dynamic instruction set computer (DISC) has been developed to support demand-driven instruction set modification and enhances the functional density of FPGAs by physically relocating instruction modules to available FPGA space.
Abstract: A Dynamic Instruction Set Computer (DISC) has been developed to support demand-driven instruction set modification. Using partial reconfiguration, DISC pages instruction modules in and out of an FPGA as demanded by the executing program. Instructions occupy FPGA resources only when needed and FPGA resources can be reused to implement an arbitrary number of performance-enhancing application-specific instructions. DISC further enhances the functional density of FPGAs by physically relocating instruction modules to available FPGA space. An image processing application was developed on DISC to demonstrate the advantages of paging application-specific instruction modules.
Keywords: FPGA processor, run-time reconfiguration, relocatable hardware, application-specific processor
1 INTRODUCTION: For many digital systems, general purpose processors do not provide sufficient processing power to operate acceptably in real-time environments. Specialized computing resources, such as digital signal processors and application-specific processors, are often used to improve available computation. A relatively new approach to improving the available computing power of embedded systems is to implement application specific circuits with Field Programmable Gate Arrays (FPGAs). Although more expensive than custom circuits, FPGAs provide a simplified, low-cost design environment.
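
The demand paging of instruction modules described in this entry can be sketched as a small cache simulation in Python. The abstract does not state a replacement policy or module sizes, so this sketch assumes equal-sized modules, a fixed number of slots, and least-recently-used eviction; the function name `run_with_paging` is hypothetical.

```python
from collections import OrderedDict


def run_with_paging(trace, capacity):
    """Count how many module loads (partial reconfigurations) a program needs.

    trace     sequence of instruction-module names in execution order
    capacity  number of modules the FPGA can hold at once (assumed equal size)
    Returns (loads, resident), where resident lists the modules left on the
    FPGA, least recently used first.
    """
    resident = OrderedDict()
    loads = 0
    for module in trace:
        if module in resident:
            resident.move_to_end(module)       # hit: mark as recently used
            continue
        if len(resident) >= capacity:
            resident.popitem(last=False)       # evict least recently used
        resident[module] = True                # page the module in
        loads += 1
    return loads, list(resident)
```

With two slots and the trace `add, mul, add, fir, mul`, this model performs four loads: the second `add` is a hit, `fir` evicts `mul`, and the final `mul` must be paged back in.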

Patent
Fredrick Zlotnick
31 Jan 1995
TL;DR: In this article, a daisy chain of FPGA configuration circuits is proposed in which each FPGA, as it finishes loading its configuration data, enables the clock source of the next FPGA to control the serial data stream while the clock source of the previous FPGA is disabled.
Abstract: A field programmable gate array (FPGA) configuration circuit reads configuration data from a memory (12) and converts the parallel data to a serial data stream through a shift register (16) clocked by a clock signal. A first FPGA (18) controls the serial data stream by providing the clock signal when enabled by a start signal. Once the configuration data has been completely loaded into the first FPGA, the first FPGA outputs a done signal to a second FPGA (20) to enable its clock to control the serial data stream into the second FPGA. The clock from the first FPGA is disabled. Each FPGA passes control to the next FPGA in a daisy chain arrangement by enabling the clock source from the next FPGA while disabling the clock source from the previous FPGA as each finishes loading its configuration data.

Journal ArticleDOI
TL;DR: An effective logic synthesis procedure based on parallel and serial decomposition of a Boolean function is presented; it is suitable for different types of FPGAs including XILINX, ACTEL and ALGOTRONIX devices.
Abstract: An effective logic synthesis procedure based on parallel and serial decomposition of a Boolean function is presented in this paper. The decomposition, carried out as the very first step of the synthesis process, is based on an original representation of the function by a set of r-partitions over the set of minterms. Two different decomposition strategies, namely serial and parallel, are exploited by striking a balance between the two ideas. The presented procedure can be applied to completely or incompletely specified, single- or multiple-output functions and is suitable for different types of FPGAs including XILINX, ACTEL and ALGOTRONIX devices. The results of the benchmark experiments presented in the paper show that, in several cases, our method produces circuits of significantly reduced complexity compared to the solutions reported in the literature.

Proceedings ArticleDOI
19 Apr 1995
TL;DR: This work combines the advantages of systolic algorithms with the low cost of developing application specific designs using field programmable gate arrays (FPGAs) to build a scalable convolver for use in computer vision systems.
Abstract: Convolution is a fundamental operation in many signal and image processing applications. Since the computation and communication pattern in a convolution operation is regular, a number of special architectures have been designed and implemented for this operator. The Von Neumann architectures cannot meet the real-time requirements of applications that use convolution as an intermediate step. We combine the advantages of systolic algorithms with the low cost of developing application specific designs using field programmable gate arrays (FPGAs) to build a scalable convolver for use in computer vision systems. The performance of the systolic algorithm of (Kung et al., 1981) is compared theoretically and experimentally with many other convolution algorithms reported in the literature. The implementation of a convolution operation on Splash 2, an attached processor based on Xilinx 4010 FPGAs, is reported with impressive performance gains.
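
To show the kind of regular, pipelined data flow that makes convolution a good fit for FPGA arrays, the Python sketch below simulates a chain of multiply-accumulate cells cycle by cycle. It is not the design of Kung et al. or the Splash 2 implementation: it assumes stationary weights, partial sums that move one cell per clock, and an input sample broadcast to all cells, which a strictly systolic design would avoid. The name `pipelined_convolve` is hypothetical.

```python
def pipelined_convolve(weights, samples):
    """Full 1-D convolution computed by a simulated chain of MAC cells.

    Cell i holds weights[i]. On each clock tick every cell multiplies the
    current input sample by its weight and adds the partial sum held by
    cell i + 1 on the previous tick. Cell 0's result is the output sample.
    """
    k = len(weights)
    sums = [0] * (k + 1)  # sums[k] is a constant zero feeding the last cell
    out = []
    # Append k - 1 zeros so the pipeline drains and the full result appears.
    for x in list(samples) + [0] * (k - 1):
        sums = [weights[i] * x + sums[i + 1] for i in range(k)] + [0]
        out.append(sums[0])
    return out
```

Each cell does the same small amount of work on every tick and talks only to its neighbour, so throughput stays at one output per clock regardless of the kernel length; that property, rather than this particular arrangement, is what the abstract relies on.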

Patent
07 Jun 1995
TL;DR: In this article, an address decoder and sequencer divides the N-bit address into first, second, and third portions and employs the first and third portions interchangeably, in accordance with the second portion, for addressing respective x and y dimensions of the plurality of programmable resources for selecting an associated programmable resource to be configured.
Abstract: A field programmable gate array has a plurality of programmable resources addressable per respective x and y dimensions of an x,y two dimensional array. A memory device provides a plurality of memory units that store configuration data for configuring associated programmable resources of the field programmable gate array. A controller addresses the memory device with an N-bit address for retrieving given configuration data. An address decoder and sequencer divides the N-bit address into first, second, and third portions and employs the first and third portions interchangeably, in accordance with the second portion, for addressing respective x and y dimensions of the plurality of programmable resources for selecting an associated programmable resource to be configured in accordance with the retrieved configuration data.
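
The address split described in the abstract above can be illustrated with a short Python sketch. The field widths, the bit ordering (first portion in the most significant bits) and the rule that a non-zero second portion swaps the roles of the other two are all assumptions; the abstract only says that the first and third portions are used interchangeably for x and y according to the second. The function name `decode_address` is hypothetical.

```python
def decode_address(address, first_bits, second_bits, third_bits):
    """Split an N-bit address into three fields and map two of them to (x, y).

    Assumed layout, most significant bits first: | first | second | third |
    Assumed rule: second == 0 gives (x, y) = (first, third), otherwise swapped.
    """
    total_bits = first_bits + second_bits + third_bits
    if not 0 <= address < (1 << total_bits):
        raise ValueError("address does not fit in the declared width")
    third = address & ((1 << third_bits) - 1)
    second = (address >> third_bits) & ((1 << second_bits) - 1)
    first = address >> (third_bits + second_bits)
    if second == 0:
        return first, third
    return third, first
```

Under these assumptions a controller can walk the array row by row or column by column using the same counter, simply by changing the middle field.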

Book ChapterDOI
01 Sep 1995
TL;DR: This study demonstrates that FPGAs can provide an order of magnitude better performance than DSP processors and can in many cases approach or exceed ASIC levels of performance.
Abstract: FPGAs have been proposed as high-performance alternatives to DSP processors. This paper quantitatively compares FPGA performance against DSP processors and ASICs using actual applications and existing CAD tools and devices. Performance measures were based on actual multiplier performance with FPGAs, DSP processors and ASICs. This study demonstrates that FPGAs can provide an order of magnitude better performance than DSP processors and can in many cases approach or exceed ASIC levels of performance.

Patent
24 Oct 1995
TL;DR: In this paper, a method for programming an FPGA memory includes the steps of downloading a first set of data into an FPGA to program it to function as an FPGA memory programmer, and then downloading a second set of data, of a type for programming the FPGA to perform one or more logic functions, which is programmed into the FPGA memory.
Abstract: A programmable logic circuit includes a non-volatile, in-circuit programmable FPGA memory containing configuration data for programming an FPGA to perform one or more desired logic functions. The chips in the circuit may be packaged individually or may be mounted in die form on a multichip module. A method for programming an FPGA memory includes the steps of downloading a first set of data into an FPGA to program the FPGA to function as an FPGA memory programmer, and then downloading a second set of data into the FPGA, the second set of data being of a type for programming the FPGA to perform one or more logic functions. The second set of data, however, is not immediately loaded into the FPGA but instead is programmed into the FPGA memory in a manner controlled by the programming function of the FPGA. A method for programming an FPGA includes as a step the aforementioned method. A multichip module contains an FPGA die and FPGA memory die, either of which may be programmed in accordance with the methods described above.

Proceedings ArticleDOI
TL;DR: According to the evaluation results based on an FPGA implementation, hardware portion of these functionalities can be executed within 250 ns and the task scheduling can be performed within 750 ns simultaneously, which are about 6 to 50 times faster than software implementation.
Abstract: This paper proposes a new approach to realize a very high performance real-time OS using VLSI technology. In this method, quick and steady response can be guaranteed by implementing basic operations of a real-time OS as a peripheral chip (Silicon TRON) to be connected to general purpose microprocessors. In order to confirm the effectiveness of this method, most basic system calls of μITRON have been designed using an HDL. Synthesis results using a 0.8 μm CMOS technology show that the most important part of the system calls can be realized as a VLSI chip. According to the evaluation results based on an FPGA implementation, the hardware portion of these functionalities can be executed within 250 ns and the task scheduling can be performed within 750 ns simultaneously, which are about 6 to 50 times faster than software implementation. Accordingly, very high performance real-time systems can be realized by the proposed method.

Proceedings ArticleDOI
19 Apr 1995
TL;DR: This paper evaluates the feasibility of reconfiguring an FPGA at run time, and tests its performance using a "Grand Challenge Problem", the high speed scanning of genomic sequence databases, to show an improvement in speed of two to three orders of magnitude.
Abstract: This paper evaluates the feasibility of reconfiguring an FPGA at run time, and tests its performance using a "Grand Challenge Problem", the high speed scanning of genomic sequence databases. Algorithm implementation on an XC3090 FPGA is described, and methods are proposed for generating a placed Xilinx Netlist File that can be efficiently routed at run time by the Automated Placing and Routing Xilinx tools, in order to increase the speed and the density of the design. The same algorithm carefully optimised on a RISC processor has been compared with the run-time reconfigured FPGA, and shows the latter to have an improvement in speed of two to three orders of magnitude.

Patent
16 Oct 1995
TL;DR: In this article, a field programmable gate array (FPGA) is equipped with two separate buffers to drive the output lines of a CLB for handling critical path situations: one buffer drives the horizontal interconnect line, while the other drives the vertical interconnect line.
Abstract: A field programmable gate array having independently buffered output lines of a CLB for handling critical path situations. One of the CLB's output ports is coupled to a vertical interconnect line and a horizontal interconnect line. Two separate buffers are used to drive these lines. One buffer drives the horizontal interconnect line, while the other drives the vertical interconnect line. One of these lines is used to conduct the output signal that corresponds to the critical path. The other line is used to conduct the output signal onto other branches that are not part of the critical path. Hence, by using a separate buffer to drive the critical path, it is not loaded with the circuits associated with the non-critical branches.

Proceedings ArticleDOI
04 Mar 1995
TL;DR: A new three-dimensional (3D) FPGA architecture is proposed, along with a fabrication methodology, and several physical-design issues in the new 3D paradigm are raised.
Abstract: Motivated by improving FPGA performance, we propose a new three-dimensional (3D) FPGA architecture, along with a fabrication methodology. We analyze the expected manufacturing yield, and raise several physical-design issues in the new 3D paradigm. Our techniques also have good implications for resource utilization, physical size, and power consumption.

Patent
23 Aug 1995
TL;DR: In this article, the critical path of an FPGA configuration is optimized by rerouting connections between the logical primitives of the critical path.
Abstract: In a Field Programmable Gate Array ("FPGA") design system, a configuration is generated. A path of the configuration is selected as a critical path for optimization. The critical path is optimized by rerouting connections between the logical primitives of the critical path. Prior to the rerouting, the logical primitives of the critical path may be optimally placed within the FPGA configuration. Optimal performance of the critical path is thus achieved.

Patent
06 Jun 1995
TL;DR: In this article, a register protect circuit controllably protects the contents of these user logic registers from being modified by signals from the user's logic, and allows these registers to be written by a microprocessor through the configuration memory addressing structure.
Abstract: In an FPGA having registers which are part of a user's logic functions and a configuration memory which is read and written through an addressing structure, a register protect circuit controllably protects the contents of these user logic registers from being modified by signals from the user's logic, allows these registers to be written by a microprocessor through the configuration memory addressing structure, and allows both the user's registers and lines which provide combinational signals to be read by a microprocessor through the configuration memory addressing structure.

Book ChapterDOI
02 Oct 1995
TL;DR: Evolvable Hardware is implemented on a PLD (Programmable Logic Device)-like device whose architecture can be altered by re-programming the architecture bits.
Abstract: This paper describes Evolvable Hardware (EHW) and its applications to pattern recognition and fault-tolerant systems. EHW can change its own hardware structure to adapt to the environment whenever environmental changes (including hardware malfunction) occur. EHW is implemented on a PLD (Programmable Logic Device)-like device whose architecture can be altered by re-programming the architecture bits. Through genetic algorithms, EHW finds the architecture bits which adapt best to the environment, and changes its hardware structure accordingly.
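
The search for well-adapted architecture bits can be sketched as a small genetic algorithm in Python. This is a generic textbook-style GA under stated assumptions, not the authors' implementation: the population size, truncation selection, single-point crossover, mutation rate and the name `evolve_architecture_bits` are all hypothetical, and the fitness function stands in for whatever measurement of the environment a real system would use.

```python
import random


def evolve_architecture_bits(fitness, n_bits, pop_size=20, generations=40,
                             mutation_rate=0.05, seed=0):
    """Search for a bit string (n_bits >= 2) that maximises fitness(bits).

    Returns (best, history), where history records the best fitness seen at
    the start of each generation and once more at the end.
    """
    rng = random.Random(seed)  # seeded so runs are repeatable
    pop = [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        history.append(fitness(pop[0]))
        elite = pop[: pop_size // 2]  # the better half survives unchanged
        children = []
        while len(elite) + len(children) < pop_size:
            a, b = rng.sample(elite, 2)
            cut = rng.randrange(1, n_bits)           # single-point crossover
            child = a[:cut] + b[cut:]
            # Flip each bit independently with probability mutation_rate.
            child = [bit ^ (rng.random() < mutation_rate) for bit in child]
            children.append(child)
        pop = elite + children
    best = max(pop, key=fitness)
    history.append(fitness(best))
    return best, history
```

Because the better half of each generation is carried over unchanged, the best fitness in this sketch never decreases; re-running the search after the fitness function changes is one simple way to model adaptation to a changed environment.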

Proceedings ArticleDOI
15 Feb 1995
TL;DR: A field-programmable analog array (FPAA) for prototyping continuous-time analog circuits is reported here, which offers simplified analog circuit design with the advantages of instant prototyping, programmable topology, programmable parameters, CAD compatibility, and testability.
Abstract: Field-programmable gate arrays for prototyping digital circuits are a widely endorsed approach for reducing time-to-market. Offering similar advantages, a field-programmable analog array (FPAA) for prototyping continuous-time analog circuits is reported here. Conceptually, a FPAA consists of configurable analog blocks (CABs) and interconnects. The function of each CAB and the connections among CABs are determined by the contents of an on-chip shift register. Different circuits can be instantiated using a FPAA by loading in different configuration bits. This IC strategy offers simplified analog circuit design with the advantages of instant prototyping, programmable topology, programmable parameters, CAD compatibility, and testability.

Patent
Stephen M. Trimberger
02 Jun 1995
TL;DR: In this paper, a microprocessor controlled device is provided which appears to a user to be a programmable logic device, and signals are taken from and placed on external pins in the same manner as would be done with a prior-art programmable device.
Abstract: In accordance with the present invention, a microprocessor controlled device is provided which appears to a user to be a programmable logic device. Signals are taken from and placed on external pins in the same manner as would be done with a prior art programmable logic device. However, internal hardware which would be provided in a programmable logic device for performing the logic function is replaced by a microprocessor with associated memory. The microprocessor is programmable to read input signals from input pins, perform calculations related to the desired logic, and place signals onto output pins. Thus the function of the microprocessor controlled device as it appears from observing signals on external pins is the same as that of a prior art FPGA or other logic device. However, internally, a program which has been stored in the memory associated with the microprocessor causes the microprocessor to serially read signals from external pins, perform the necessary calculations, and place signals onto output pins. Multiple microprocessors in the same logic device can also be provided.
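
The read, compute, write cycle described in the abstract above can be illustrated with a minimal Python sketch. It assumes the stored program is simply a truth table indexed by the input pin values, which is only one way such a device could compute its outputs; the names `emulate_logic_device` and `full_adder_table` are hypothetical.

```python
def emulate_logic_device(truth_table, input_pins):
    """Return the output pin values for one evaluation of the emulated logic.

    truth_table  list indexed by the input pins read as a binary number
                 (first pin is the most significant bit); each entry is a
                 tuple of output pin values
    input_pins   sequence of 0/1 values sampled from the external pins
    """
    index = 0
    for bit in input_pins:  # pins are read one after another, not in parallel
        index = (index << 1) | (bit & 1)
    return truth_table[index]


def full_adder_table():
    """Example stored program: inputs (a, b, carry_in), outputs (sum, carry_out)."""
    table = []
    for i in range(8):
        a, b, cin = (i >> 2) & 1, (i >> 1) & 1, i & 1
        table.append((a ^ b ^ cin, (a & b) | (cin & (a ^ b))))
    return table
```

Seen from the pins, the result matches what a hardware full adder would produce, but the serial read and lookup take processor cycles, so response time depends on the program rather than on gate delays.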