Showing papers on "Field-programmable gate array" published in 1992


Patent
11 Dec 1992
TL;DR: An integrated circuit computing device comprises a dynamically configurable Field Programmable Gate Array (FPGA), which is configured to implement a RISC processor and a Reconfigurable Instruction Execution Unit.
Abstract: An integrated circuit computing device is comprised of a dynamically configurable Field Programmable Gate Array (FPGA). This gate array is configured to implement a RISC processor and a Reconfigurable Instruction Execution Unit. Since the FPGA can be dynamically reconfigured, the Reconfigurable Instruction Execution Unit can be dynamically changed to implement complex operations in hardware rather than in time-consuming software routines. This feature allows the computing device to operate at speeds that are orders of magnitude greater than traditional RISC or CISC counterparts. In addition, the programmability of the computing device makes it very flexible and hence, ideally suited to handle a large number of very complex and different applications.

346 citations


Patent
18 Sep 1992
TL;DR: An improved electronic design automation (EDA) system employs field programmable gate arrays (FPGAs) for emulating prototype circuit designs; a circuit netlist file is downloaded to the FPGAs to configure them to emulate a functional representation of the prototype circuit.
Abstract: An improved electronic design automation (EDA) system employs field programmable gate arrays (FPGAs) for emulating prototype circuit designs. A circuit netlist file is down-loaded to the FPGAs to configure the FPGAs to emulate a functional representation of the prototype circuit. To check whether the circuit netlist is implemented properly, the FPGAs are tested functionally by applying input vectors thereto and comparing the resulting output of the FPGAs to output vectors provided from prior simulation. If the FPGAs fail such vector comparison, the FPGAs are debugged by inserting "read-back" trigger instructions in the input vectors, preferably corresponding to fail points in the applied vector stream. Modifying the input vectors with such read-back signals causes the internal states of latches and flip-flops in each FPGA to be captured when functional testing is repeated. Such internal state information is useful for debugging the FPGAs, and particularly convenient because no recompilation of the circuit netlist is required. A similar approach which also uses the read-back feature of FPGAs is employed to debug FPGAs coupled to a target system which appears to fail during emulation runs.

234 citations
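
The vector-comparison and read-back steps described in the abstract above can be illustrated with a minimal sketch. It is not taken from the patent: the function names, the string representation of vectors, the `READBACK` marker, and the choice to place the marker directly after each failing vector are all assumptions made for illustration.

```python
from typing import List, Sequence


def find_fail_points(observed: Sequence[str], expected: Sequence[str]) -> List[int]:
    """Return the indices where the FPGA output differs from the simulated output."""
    if len(observed) != len(expected):
        raise ValueError("vector streams must have the same length")
    return [i for i, (o, e) in enumerate(zip(observed, expected)) if o != e]


def insert_readback_triggers(
    input_vectors: Sequence[str],
    fail_points: Sequence[int],
    trigger: str = "READBACK",  # hypothetical marker, not a real instruction name
) -> List[str]:
    """Return a new input stream with a trigger marker after each failing vector."""
    fails = set(fail_points)
    stream: List[str] = []
    for i, vec in enumerate(input_vectors):
        stream.append(vec)
        if i in fails:
            stream.append(trigger)
    return stream


if __name__ == "__main__":
    expected = ["00", "01", "10", "11"]   # output vectors from prior simulation
    observed = ["00", "11", "10", "10"]   # output vectors captured from the FPGAs
    fails = find_fail_points(observed, expected)
    print(fails)
    print(insert_readback_triggers(["a", "b", "c", "d"], fails))
```

Because only the input vector stream is modified, the sketch mirrors the abstract's point that no recompilation of the circuit netlist is needed to repeat the test with state capture.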


Book ChapterDOI
11 Nov 1992
TL;DR: In this paper, the authors present some quantitative performance measurements for the computing power of Programmable Active Memories (PAM), based on Field Programmable Gate Array (FPGA) technology, which is a universal hardware co-processor closely coupled to a standard host computer.
Abstract: We present some quantitative performance measurements for the computing power of Programmable Active Memories (PAM), as introduced by [BRV 89]. Based on Field Programmable Gate Array (FPGA) technology, the PAM is a universal hardware co-processor closely coupled to a standard host computer. The PAM can speed up many critical software applications running on the host, by executing part of the computations through a specific hardware design. The performance measurements presented are based on two PAM architectures and ten specific applications, drawn from arithmetics, algebra, geometry, physics, biology, audio and video. Each of these PAM designs proves as fast as any reported hardware or super-computer for the corresponding application. In cases where we could bring some genuine algorithmic innovation into the design process, the PAM has proved an order of magnitude faster than any previously existing system (see [SBV 91] and [S 92]).

194 citations


Journal ArticleDOI
C.E. Cox1, W.E. Blanz1
TL;DR: The authors take advantage of the reprogrammability of the devices to automatically generate new custom hardware for each application of the classifier, which is a totally digital connectionist classifier.
Abstract: The architecture, implementation, and application of GANGLION, a totally digital connectionist classifier, are described. This fully interconnected feedforward net with one hidden layer is capable of generating 4.48 billion interconnection/s. The architecture is realized on a single 9U VME card and is built entirely from off-the-shelf components. The very high throughput of 20 million decision/s is achieved by making efficient use of field-programmable gate arrays. Specifically, the authors take advantage of the reprogrammability of the devices to automatically generate new custom hardware for each application of the classifier.

139 citations


Book
30 Jun 1992
TL;DR: The book introduces FPGAs and covers commercially available devices, technology mapping, logic block architecture, routing, the flexibility of routing architectures, and a theoretical model for FPGA routing.
Abstract: Preface. Glossary. 1. Introduction to FPGAs. 2. Commercially Available FPGAs. 3. Technology Mapping for FPGAs. 4. Logic Block Architecture. 5. Routing for FPGAs. 6. Flexibility of FPGA Routing Architectures. 7. A Theoretical Model for FPGA Routing. References. Index.

129 citations


Proceedings ArticleDOI
11 Oct 1992
TL;DR: The Realizer, a system which automatically configures a network of field-programmable gate arrays (FPGAs) to implement large digital logic designs, is presented, and the interconnection architecture, called the partial crossbar, greatly reduces system-level placement and routing complexity.
Abstract: The Realizer, a system which automatically configures a network of field-programmable gate arrays (FPGAs) to implement large digital logic designs, is presented. Logic and interconnect are separated to achieve optimum FPGA utilization. The interconnection architecture, called the partial crossbar, greatly reduces system-level placement and routing complexity, achieves bounded interconnect delay, scales linearly with pin count, and allows hierarchical expansion to systems with hundreds or thousands of FPGA devices in a fast and uniform way. An actual multiboard system has been built, using 42 XC3090 FPGAs for logic. A 32-b CPU datapath has been automatically realized and operated at speed, and demonstrates very good FPGA utilization.

122 citations


Journal ArticleDOI
TL;DR: While the results depend on the delay of the programmable routing, experiments indicate that five- and six-input lookup tables and certain multiplexer configurations produce the lowest total delay over realistic values of routing delay.
Abstract: The authors explore the effect of logic block architecture on the speed of a field-programmable gate array (FPGA). Four classes of logic block architecture are investigated: NAND gates, multiplexer configurations, lookup tables, and wide-input AND-OR gates. An experimental approach is taken, in which each of a set of benchmark logic circuits is synthesized into FPGAs that use different logic blocks. The speed of the resulting FPGA implementations using each logic block is measured. While the results depend on the delay of the programmable routing, experiments indicate that five- and six-input lookup tables and certain multiplexer configurations produce the lowest total delay over realistic values of routing delay. The fine grain blocks, such as the two-input NAND gate, exhibit poor performance because these gates require many levels of logic block to implement the circuits and hence require a large routing delay.

99 citations


Journal ArticleDOI
TL;DR: AnyBoard, a low-cost, field programmable gate array (FPGA)-based, reconfigurable rapid-prototyping system is described and the implementation of a pattern generator design is presented to illustrate the system's effectiveness.
Abstract: AnyBoard, a low-cost, field programmable gate array (FPGA)-based, reconfigurable rapid-prototyping system is described. The system hardware organization and software tools that help users automatically map designs to the FPGAs and manage the design process are discussed. The implementation of a pattern generator design is presented to illustrate the system's effectiveness.

87 citations


Proceedings ArticleDOI
Chiang1, Forouhi1, Chen1, Hawley1, McCollum1, Hamdy1, Hu2 
01 Jan 1992
TL;DR: The authors discuss the characteristics of various antifuse structures used as programming elements in field programmable gate array devices, along with the tradeoffs between performance and reliability.
Abstract: Antifuse structure as a programming element has become increasingly popular in field programmable gate array devices. In this paper we discuss the characteristics of various antifuse structures. Tradeoffs between performance and reliability are also discussed.

69 citations


Book ChapterDOI
31 Aug 1992
TL;DR: An implementation of a novel systolic array for sequence alignment on the SPLASH reconfigurable logic array that performs several orders of magnitude faster than implementation on supercomputers.
Abstract: This paper describes an implementation of a novel systolic array for sequence alignment on the SPLASH reconfigurable logic array. The systolic array operates in two phases. In the first phase, a sequence comparison array due to Lopresti [1] is used to compute a matrix of distances which is stored in local RAM. In the second phase, the stored distances are used by the alignment array to produce a binary encoding of the sequence alignment. Preliminary benchmarks show that the SPLASH implementation performs several orders of magnitude faster than implementation on supercomputers.

68 citations
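
The two-phase scheme in the abstract above (compute and store a distance matrix, then derive the alignment from the stored distances) can be sketched in software. This is not the SPLASH systolic design: the unit-cost edit distance, the `M`/`S`/`D`/`I` operation letters, and the function names are assumptions chosen for illustration, and the paper's binary encoding of the alignment is not reproduced.

```python
from typing import List


def distance_matrix(a: str, b: str) -> List[List[int]]:
    """Phase 1: fill the full edit-distance matrix (unit costs assumed)."""
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i                      # delete all of a[:i]
    for j in range(cols):
        d[0][j] = j                      # insert all of b[:j]
    for i in range(1, rows):
        for j in range(1, cols):
            diag = d[i - 1][j - 1] + (a[i - 1] != b[j - 1])
            d[i][j] = min(diag, d[i - 1][j] + 1, d[i][j - 1] + 1)
    return d


def traceback(a: str, b: str, d: List[List[int]]) -> str:
    """Phase 2: walk the stored matrix backwards to recover one optimal alignment."""
    i, j = len(a), len(b)
    ops: List[str] = []
    while i > 0 or j > 0:
        if i > 0 and j > 0 and d[i][j] == d[i - 1][j - 1] + (a[i - 1] != b[j - 1]):
            ops.append("M" if a[i - 1] == b[j - 1] else "S")  # match / substitute
            i, j = i - 1, j - 1
        elif i > 0 and d[i][j] == d[i - 1][j] + 1:
            ops.append("D")                                    # delete from a
            i -= 1
        else:
            ops.append("I")                                    # insert from b
            j -= 1
    return "".join(reversed(ops))


if __name__ == "__main__":
    a, b = "kitten", "sitting"
    d = distance_matrix(a, b)
    print(d[-1][-1], traceback(a, b, d))
```

Separating the phases, as the abstract describes, means the second pass only reads stored distances and never recomputes them.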


Proceedings ArticleDOI
11 Oct 1992
TL;DR: A routing-driven technology mapper for lookup-table (LUT)-based field-programmable gate arrays (FPGAs) that can handle both combinational and sequential logic circuits, and has been implemented for combinational circuits.
Abstract: A routing-driven technology mapper for lookup-table (LUT)-based field-programmable gate arrays (FPGAs) is presented. The approach is based on performing mapping aimed at routing feasibility. For an FPGA of given size (number of LUTs), the logic being implemented is distributed in such a manner that the total wire length is minimized and the routing resources are not overutilized. Simulated annealing is used to perform mapping, placement, and global routing in tandem. The algorithm can handle both combinational and sequential logic circuits, and has been implemented for combinational circuits. Experiments on MCNC benchmark circuits show encouraging results.

Book ChapterDOI
31 Aug 1992
TL;DR: This paper describes Montage, a Triptych-based FPGA designed for implementing asynchronous logic and interfacing separately-clocked synchronous circuits, and demonstrates how the Montage FPGA satisfies the demands of these classes of circuits.
Abstract: Field-programmable gate arrays are frequently used to implement system interfaces and glue logic. However, there has been little attention given to the special problems of these types of circuits in FPGA architectures. In this paper we describe Montage, a Triptych-based FPGA designed for implementing asynchronous logic and interfacing separately-clocked synchronous circuits. Asynchronous circuits have different requirements than synchronous circuits, which make standard FPGAs unusable for asynchronous applications. At the same time, many asynchronous design methodologies allow components with greatly different performance to be substituted for one another, making a design environment which migrates between FPGA, MPGA, and semi-custom implementations very attractive. Similar problems also exist for interfacing separately-clocked synchronous circuits. We discuss these problems, and demonstrate how the Montage FPGA satisfies the demands of these classes of circuits.

Patent
F. Erich Goetting1
23 Jul 1992
TL;DR: A configuration control unit (CCU) is used in an FPGA device having programmable logic cells and a programmable interconnect array, where the logic cells are programmed using transistors controlled by memory cells and the interconnect structure is programmed using antifuses.
Abstract: The present invention is used in an FPGA device having programmable logic cells and a programmable interconnect array. In a preferred embodiment in which the logic cells are programmed using transistors controlled by memory cells and the interconnect structure is programmed using antifuses, a configuration control unit (CCU) of the present invention can accomplish three functions: 1) applying programming voltages to terminals of the interconnect antifuses; 2) configuring the logic cells; and 3) reading status of signals on the interconnect structure. The CCUs are connected together into a shift register. Each CCU connects to a horizontal or vertical interconnect line. At intersections of these interconnect lines are antifuses. By loading logical 1's into the two CCUs, it is possible to address the antifuse at the intersection of the two interconnect lines. A voltage difference can then be directed to the two terminals of that antifuse for programming the antifuse. After antifuses are programmed, configuration information is shifted into the CCUs to configure the logic cells. These same CCUs can be used to capture the logical states of each of the interconnect lines, each CCU capturing one signal present on an interconnect line to which that CCU connects. The captured data can then be shifted out through the shift register.
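
The addressing idea in the abstract above (CCUs chained into a shift register, with logical 1's in two CCUs selecting the antifuse where their lines cross) can be modelled with a small sketch. The register layout (horizontal-line CCUs first, then vertical-line CCUs), the shift direction, and all names here are assumptions for illustration; the patent's actual circuit is not reproduced.

```python
from typing import Iterable, List, Optional, Tuple


def shift_in(register: List[int], bits: Iterable[int]) -> List[int]:
    """Serially shift bits into position 0, moving existing contents toward the end."""
    reg = list(register)
    for b in bits:
        reg = [b] + reg[:-1]
    return reg


def addressed_antifuse(register: List[int], n_rows: int) -> Optional[Tuple[int, int]]:
    """Return (row, col) if exactly one horizontal and one vertical CCU hold a 1."""
    rows = [i for i in range(n_rows) if register[i]]
    cols = [j for j in range(len(register) - n_rows) if register[n_rows + j]]
    if len(rows) == 1 and len(cols) == 1:
        return rows[0], cols[0]
    return None  # no unique intersection is selected


if __name__ == "__main__":
    n_rows, n_cols = 3, 4
    # Desired final contents: row 1 and column 2 selected.
    pattern = [0, 1, 0, 0, 0, 1, 0]
    # The first bit shifted in ends up at the far end, so shift in reverse order.
    reg = shift_in([0] * (n_rows + n_cols), reversed(pattern))
    print(reg, addressed_antifuse(reg, n_rows))
```

The same register model covers the read-back use in the abstract: once each CCU has captured the state of its line, shifting the register out delivers the captured bits serially.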

Proceedings ArticleDOI
S. Trimberger1, M.-R. Chene1
11 Oct 1992
TL;DR: The authors describe a partitioning method that includes both logic-based and placement-based steps to achieve a high-quality legal partitioning and simultaneously generates an initial placement for the design.
Abstract: Lookup-table-based field-programmable gate array (FPGA) logic blocks contain multiple lookup-tables, flip flops, and other features. The partitioning of this logic into physical blocks has a logical component, traditionally handled as part of technology mapping in logic synthesis, and a physical component, traditionally handled by placement in physical design. However, methods that use a purely logical partitioning give designs that are difficult to route, and methods that use a purely physical partitioning do not result in legal logical blocks. The authors describe a partitioning method that includes both logic-based and placement-based steps to achieve a high-quality legal partitioning. The method simultaneously generates an initial placement for the design.


Proceedings ArticleDOI
01 Jul 1992
TL;DR: A new approach to technology mapping for area and delay for truth-table-based field programmable gate arrays is presented as a case of clique partitioning for which an efficient heuristic was developed.
Abstract: The authors present a new approach to technology mapping for area and delay for truth-table-based field programmable gate arrays. They view the area and delay optimizations during technology mapping as a case of clique partitioning for which an efficient heuristic was developed. Alternate decompositions were explored by using Shannon expansion. Experimental results are included that were obtained by this approach for area and delay optimization on a number of benchmark examples.
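
The Shannon expansion mentioned in the abstract above decomposes a function about one variable as f = x'·f(x=0) + x·f(x=1). The sketch below only demonstrates that identity on small Boolean functions; it does not reproduce the paper's clique-partitioning heuristic, and the helper names are hypothetical.

```python
from itertools import product
from typing import Callable

BoolFn = Callable[..., int]


def cofactor(f: BoolFn, var: int, value: int) -> BoolFn:
    """Return f with input number `var` fixed to `value` (other inputs unchanged)."""
    def g(*bits: int) -> int:
        fixed = list(bits)
        fixed[var] = value
        return f(*fixed)
    return g


def shannon(f: BoolFn, var: int) -> BoolFn:
    """Rebuild f from its two cofactors: f = x'.f0 + x.f1."""
    f0, f1 = cofactor(f, var, 0), cofactor(f, var, 1)

    def rebuilt(*bits: int) -> int:
        x = bits[var]
        return ((1 - x) & f0(*bits)) | (x & f1(*bits))
    return rebuilt


def is_equivalent(f: BoolFn, g: BoolFn, n_inputs: int) -> bool:
    """Exhaustively compare two functions over all 2**n_inputs input patterns."""
    return all(f(*b) == g(*b) for b in product((0, 1), repeat=n_inputs))


if __name__ == "__main__":
    def majority(a: int, b: int, c: int) -> int:
        return (a & b) | (b & c) | (a & c)

    print(all(is_equivalent(majority, shannon(majority, v), 3) for v in range(3)))
```

Expanding about different variables yields different pairs of cofactors, which is what makes the expansion a source of alternate decompositions.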

Book ChapterDOI
Beat Heeb1, Cuno Pfister1
31 Aug 1992
TL;DR: Chameleon is an experimental workstation based on a RISC processor that radically relies on FPGAs for all input/output functions and serves as a means to probe the limits of FPGA usage while at the same time being the development system for its own FPGA circuits.
Abstract: Chameleon is an experimental workstation based on a RISC processor. It provides unprecedented flexibility and speed for certain applications due to the use of RAM-configurable Field Programmable Gate Arrays (FPGAs). FPGAs are used to replace glue logic as well as to provide a non-dedicated computation resource. This resource can be regarded as a general purpose coprocessor which can be reconfigured and thus transformed into a special purpose coprocessor in milliseconds at run-time. The coprocessor can be used both for handling complex input/output functions as well as to replace time-critical inner loops of user programs running on the central processing unit. Chameleon radically relies on FPGAs for all input/output functions. It serves as a means to probe the limits of FPGA usage while at the same time being the development system for its own FPGA circuits.

Proceedings ArticleDOI
04 Jan 1992
TL;DR: Four classes of architectures are discussed for connecting an array of FPGA chips and also logic blocks in a chip to achieve functionality and performance comparable to that of ASIC designs using standard cells and mask programmable gate arrays and at a low cost and reduced development time.
Abstract: Rapid prototyping and simulation of digital systems using arrays of field programmable gate array (FPGA) chips is expected to be an important aspect of digital design and implementation. Four classes of architectures are discussed for connecting an array of FPGA chips and also logic blocks in a chip to achieve functionality and performance comparable to that of ASIC designs using standard cells and mask programmable gate arrays and at a low cost and reduced development time. The four classes of architectures use technologies such as antifuse, SRAM, Electrically Erasable PROM (EEPROM), and (Ultraviolet) Erasable PROM (EPROM) to control and configure interconnection structures between the logic blocks. The software requirements for these classes of architectures and potential performance are discussed. The impact of these architectures on commercial products is also described.

Proceedings ArticleDOI
Kukimoto1, Fujita1
01 Jan 1992
TL;DR: In this paper, a method to rectify lookup-table-type field-programmable gate array (FPGA) designs is presented, where only the functionality realized by lookup tables in a chip is modified and the netlist is retained so that there is no change in the delay of the chip.
Abstract: A method to rectify lookup-table-type field-programmable gate array (FPGA) designs is presented. Instead of changing the netlist, only the functionality realized by lookup tables in a chip is modified and the netlist is retained so that there is no change in the delay of the chip. The problem is formalized using characteristic functions, and a redesign technique based on Boolean relations is presented.

Proceedings ArticleDOI
Francis1
01 Jan 1992
TL;DR: In this paper, the authors discuss combinational logic synthesis for FPGAs that use lookup tables (LUTs), and discuss the issues that differentiate LUT synthesis from conventional logic synthesis.
Abstract: Discusses combinational logic synthesis for FPGAs that use lookup tables (LUTs). Issues that differentiate LUT synthesis from conventional logic synthesis are emphasized. The ability of a K-input LUT to implement any Boolean function of K variables differentiates the synthesis of LUT circuits from that for conventional ASIC technologies. The major difference occurs during the technology mapping phase of logic synthesis. For values of K greater than 3, the larger number of functions that can be implemented by a K-input LUT makes it impractical to use a conventional library-based technology mapping. However, the completeness of the set of functions that can be implemented by a LUT eliminates the need for a library of separate functions. In addition, this completeness can be leveraged to optimize the final circuit.
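
The completeness property in the abstract above follows from a K-input LUT being a stored truth table with 2^K entries, so it can hold any of the 2^(2^K) Boolean functions of K variables. A minimal sketch, with hypothetical helper names and a most-significant-bit-first input ordering chosen for illustration:

```python
from itertools import product
from typing import Callable, List, Sequence


def make_lut(func: Callable[..., int], k: int) -> List[int]:
    """Store func as a 2**k-entry truth table (first input is the most significant bit)."""
    return [func(*bits) & 1 for bits in product((0, 1), repeat=k)]


def lut_eval(table: Sequence[int], bits: Sequence[int]) -> int:
    """Look up the output for one input pattern by forming the table index."""
    index = 0
    for b in bits:
        index = (index << 1) | b
    return table[index]


def num_functions(k: int) -> int:
    """Number of distinct Boolean functions of k variables."""
    return 2 ** (2 ** k)


if __name__ == "__main__":
    xor3 = make_lut(lambda a, b, c: a ^ b ^ c, 3)
    print(xor3, lut_eval(xor3, (1, 0, 1)))
    for k in range(2, 6):
        print(k, num_functions(k))
```

The count grows doubly exponentially with K, which illustrates why enumerating every function as a separate library cell quickly becomes impractical.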

Proceedings ArticleDOI
08 Mar 1992
TL;DR: A hardware implementation of fuzzy controllers on field programmable gate arrays (FPGAs) is described and software for synthesizing fuzzy controllers into Boolean equations was developed, providing a complete design automation tool for fuzzy controllers.
Abstract: A hardware implementation of fuzzy controllers on field programmable gate arrays (FPGAs) is described. FPGAs are semicustom integrated circuits that combine the attractive features of both programmable logic devices and gate arrays. Software for synthesizing fuzzy controllers into Boolean equations was developed. The file that contains the set of Boolean equations is accepted directly by the development system of the FPGA. The development system then produces the necessary code for programming the FPGA chip. The speed of the fuzzy controller is determined by the response time of the FPGA circuit that realizes the Boolean equations. A speed of 50M FLIPS was achieved. The software together with the FPGA development system provide a complete design automation tool for fuzzy controllers.

Proceedings ArticleDOI
08 Nov 1992
TL;DR: A method to rectify lookup-table-type field-programmable gate array (FPGA) designs is presented, where only the functionality realized by lookup tables in a chip is modified and the netlist is retained so that there is no change in the delay of the chip.
Abstract: A method to rectify lookup-table-type field-programmable gate array (FPGA) designs is presented. Instead of changing the netlist, only the functionality realized by lookup tables in a chip is modified and the netlist is retained so that there is no change in the delay of the chip. The problem is formalized using characteristic functions, and a redesign technique based on Boolean relations is presented.

Proceedings ArticleDOI
11 Oct 1992
TL;DR: A graph-based technology mapping algorithm, called DAG-Map, for delay optimization in lookup-table-based field programmable gate array (FPGA) designs is presented and a graph-matching-based technique is used as a postprocessing step which optimizes the area without increasing the delay.
Abstract: A graph-based technology mapping algorithm, called DAG-Map, for delay optimization in lookup-table-based field programmable gate array (FPGA) designs is presented. The algorithm carries out technology mapping and delay optimization on the entire Boolean network, instead of decomposing it into fanout-free trees and mapping each tree separately as in most previous algorithms. As a preprocessing step, a general algorithm that transforms an arbitrary n-input network into a two-input network with only O(1) factor increase in the network depth is introduced. Also presented is a graph-matching-based technique used as a postprocessing step which optimizes the area without increasing the delay. The DAG-Map algorithm was tested on the MCNC logic synthesis benchmarks. Compared with previous algorithms, it reduces both the network depth and the number of lookup-tables.

Proceedings ArticleDOI
11 Oct 1992
TL;DR: The interactions between the CAD tools that are used to configure the routing resources of a field-programmable gate array (FPGA) and the design of the routing architecture itself are examined and it is demonstrated that the fewest routing switches are required when each logical pin appears on only one side of the logic cell rather than two or more.
Abstract: The interactions between the CAD tools that are used to configure the routing resources of a field-programmable gate array (FPGA) and the design of the routing architecture itself are examined. Such an understanding is used to determine where to reduce the number of routing switches in the FPGA while maintaining routability. Experiments are used to study a switch block that was previously thought to have unacceptably low flexibility. It is shown that the performance of this switch block can be improved by adapting the global router to require less flexibility in the architecture, and by careful placement of physical pins on the logic blocks. It is demonstrated that the fewest routing switches are required when each logical pin appears on only one side of the logic cell rather than two or more.

01 Jan 1992
TL;DR: Triptych is described, a new FPGA architecture that blends logic and routing resources to achieve efficient implementation of a wide range of circuits in both area and speed and a new method for architectural comparison of FPGAs that is free of irrelevant implementation effects is developed.
Abstract: We describe Triptych, a new FPGA architecture, that blends logic and routing resources to achieve efficient implementation of a wide range of circuits in both area and speed. The physical structure of Triptych attempts to match the structure of factored logic functions, thus providing an efficient substrate in which to implement these circuits. This approach both requires and takes advantage of an integrated approach to the mapping, placement and routing process. We first describe the Triptych architecture in detail. This is followed by the development of a new method for architectural comparison of FPGAs that is free of irrelevant implementation effects. Then the Triptych, Xilinx, Algotronix, and Concurrent Logic architectures are compared using this method to obtain normalized area and performance figures for a wide range of circuits, including both datapath elements and control logic. Our results indicate that Triptych is more area-efficient (Xilinx mappings average 3.5 times larger than Triptych mappings) and has at least comparable delay characteristics.

Proceedings ArticleDOI
11 Oct 1992
TL;DR: One approach to self-timed design is explored and implementations of an example circuit in three different technologies are described, representing a wide range of price and performance characteristics.
Abstract: The authors explore one approach to self-timed design and describe implementations of an example circuit in three different technologies. The simple routing chip used as the example has been described by writing a program in OCCAM, translated into a circuit consisting of a small set of basic modules, and implemented using Actel FPGA (field-programmable gate array), CMOS, and GaAs technologies. These technologies represent a wide range of price and performance characteristics.

Proceedings ArticleDOI
01 Jul 1992
TL;DR: TEMPT is a technology mapping algorithm aimed at exploring field-programmable gate array (FPGA) architectures with hard-wired connections that is as effective as the Xilinx 4000 CLB mapper, PPR, when minimizing CLBs to implement a set of MCNC benchmarks.
Abstract: TEMPT is a technology mapping algorithm aimed at exploring field-programmable gate array (FPGA) architectures with hard-wired connections. TEMPT maps a network of basic blocks to a netlist of hard-wired logic blocks (HLBs), in which each HLB consists of several basic hard-wire blocks connected in an arbitrary tree topology, and optimizes either speed or area. TEMPT is as effective as the Xilinx 4000 CLB mapper, PPR, when minimizing CLBs to implement a set of MCNC benchmarks. Using TEMPT it was shown empirically that many HLBs were significantly faster than FPGAs without hard-wired links. Several HLBs were demonstrated that exhibited superior logic density to the Xilinx 4000 CLB.

Book ChapterDOI
31 Aug 1992
TL;DR: It is shown that this machine can be programmed by translating a subset of the Occam language into asynchronous modules, and a new method of formally verifying asynchronous modules for these circuits is presented using the Circal process algebra.
Abstract: The SPACE machine is introduced as a new type of computer architecture, capable of very fast simulation of highly concurrent systems. The machine is designed to be scalable, constructed from a vast array of boards. The decisions made in the design of the board are discussed, and the actual hardware (based on an array of Field Programmable Gate Array chips) is described. It is shown that this machine can be programmed by translating a subset of the Occam language into asynchronous modules. Using the Circal process algebra, a new method of formally verifying asynchronous modules for these circuits is presented. This method allows bounded gate delays to be included in a two-level modelling mechanism.

Proceedings Article
31 Aug 1992
TL;DR: Two programs are presented, exact and approximate, for the minimization of Permuted Reed-Muller Trees that are obtained by repetitive application of Davio expansions (Shannon expansions for EXOR gates) in all possible orders of variables in subtrees.
Abstract: The new family of Field Programmable Gate Arrays, CLI6000 from Concurrent Logic Inc., realizes truly Cellular Logic. It has been mainly designed for the realization of data path architectures. However, the new universal logic cell it introduces also calls for new logic synthesis methods based on regularity of connections. In this paper we present two programs, exact and approximate, for the minimization of Permuted Reed-Muller Trees that are obtained by repetitive application of Davio expansions (Shannon expansions for EXOR gates) in all possible orders of variables in subtrees. Such trees are particularly well matched to both the logic cell realization and the connection structure of the CLI6000 device. It is shown on several standard benchmarks that the heuristic algorithm gives good quality results in much less time than the exact algorithm.
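
The Davio expansion named in the abstract above is the EXOR counterpart of Shannon expansion; its positive form is f = f0 XOR x·(f0 XOR f1), where f0 and f1 are the cofactors for x = 0 and x = 1. Applying it once per variable in one fixed order gives the positive-polarity Reed-Muller coefficients of a truth table. The sketch below shows only that fixed-order case; it does not search over variable orders and does not reproduce the paper's exact or approximate minimizers. Function names and the bit-index convention are assumptions.

```python
from typing import List, Sequence


def reed_muller_coeffs(truth: Sequence[int]) -> List[int]:
    """Apply positive Davio expansion to every variable of a 2**n-entry truth table.

    Index bit i of an entry is the value of variable i. After the transform,
    entry s is the coefficient of the product of the variables whose bits are set in s.
    """
    c = list(truth)
    n = len(c)
    step = 1
    while step < n:                      # one pass per variable
        for start in range(0, n, 2 * step):
            for i in range(start, start + step):
                c[i + step] ^= c[i]      # coefficient with x becomes f0 XOR f1
        step *= 2
    return c


def eval_reed_muller(coeffs: Sequence[int], x: int) -> int:
    """Evaluate the EXOR-of-products form at input pattern x (bit i = variable i)."""
    value = 0
    for s, coeff in enumerate(coeffs):
        if coeff and (s & x) == s:       # product term s is 1 only if all its variables are 1
            value ^= 1
    return value


if __name__ == "__main__":
    or2 = [0, 1, 1, 1]                   # a OR b
    print(reed_muller_coeffs(or2))       # coefficients of 1, a, b, ab
```

For the two-input OR, the result corresponds to the familiar identity a OR b = a XOR b XOR ab.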

20 Feb 1992
TL;DR: In this article, a mixed technology ASIC is used to implement high-frequency, current-regulated PWM in a digital format with an on-chip analogue interface, which brings advantages over its all-analogue precursor in terms of reduced calibration, better repeatability, precisely controllable modes of operation and lower component costs.
Abstract: Reductions in the size of power electronic equipment brought about by improved devices and packaging technology can now be matched even at moderate sales volumes by smaller control circuits employing ASICs. A digital ASIC is used to replace a microprocessor in a sinusoidal PWM system. An encoded look-up table is used to greatly reduce the silicon area required and hence reduce costs. A mixed technology ASIC is used to implement high-frequency, current-regulated PWM in a digital format with an on-chip analogue interface. This brings advantages over its all-analogue precursor in terms of reduced calibration, better repeatability, precisely controllable modes of operation and lower component costs. Field-programmable gate-arrays were used for proving design concepts and testing chip performance in the complete system before committing designs to silicon, thereby saving time and costs during development and allowing greater opportunities for evaluation of the designs.