
Showing papers on "Field-programmable gate array" published in 1993


Journal ArticleDOI
01 Jul 1993
TL;DR: A survey of field-programmable gate array (FPGA) architectures and the programming technologies used to customize them is presented; a classification of logic blocks based on their granularity is proposed, and several logic blocks used in commercially available FPGAs are described.
Abstract: A survey of field-programmable gate array (FPGA) architectures and the programming technologies used to customize them is presented. Programming technologies are compared on the basis of their volatility, size, parasitic capacitance, resistance, and process technology complexity. FPGA architectures are divided into two constituents: logic block architectures and routing architectures. A classification of logic blocks based on their granularity is proposed, and several logic blocks used in commercially available FPGAs are described. A brief review of recent results on the effect of logic block granularity on logic density and performance of an FPGA is then presented. Several commercial routing architectures are described in the context of a general routing architecture model. Finally, recent results on the tradeoff between the flexibility of an FPGA routing architecture, its routability, and its density are reviewed.

362 citations


Journal ArticleDOI
TL;DR: A behavioral model of a class of mixed hardware-software systems is presented and a codesign methodology for such systems is defined.
Abstract: A behavioral model of a class of mixed hardware-software systems is presented. A codesign methodology for such systems is defined. The methodology includes hardware-software partitioning, behavioral synthesis, software compilation, and demonstration on a testbed consisting of a commercial central processing unit (CPU), field-programmable gate arrays, and programmable interconnections. Design examples that illustrate how certain characteristics of system behavior and constraints suggest hardware or software implementation are presented.

280 citations


Patent
02 Apr 1993
TL;DR: A compilation technique that overcomes device pin limitations using virtual interconnections is presented; it intelligently multiplexes each physical wire among multiple logical wires and pipelines these connections at the maximum clocking frequency.
Abstract: A compilation technique overcomes device pin limitations using virtual interconnections. Virtual interconnections overcome pin limitations by intelligently multiplexing each physical wire among multiple logical wires and pipelining these connections at the maximum clocking frequency. Virtual interconnections increase usable bandwidth and relax the absolute limits imposed on gate utilization in logic emulation systems employing Field Programmable Gate Arrays (FPGAs). A "softwire" compiler utilizes static routing and relies on minimal hardware support. The technique can be applied to any topology and FPGA device.

263 citations


Patent
Thomas A. Kean1
05 Nov 1993
TL;DR: A hierarchical routing structure for field programmable gate arrays (FPGAs) is presented, together with select units that allow memory bits to be addressed both individually and in large and arbitrary groups.
Abstract: A field programmable gate array (FPGA) of cells arranged in rows and columns is interconnected by a hierarchical routing structure. Switches separate the cells into blocks and into blocks of blocks, with routing lines interconnecting the switches to form the hierarchy. Also, select units for allowing memory bits to be addressed both individually and in large and arbitrary groups are disclosed. Further, a control store for configuring the FPGA is addressed as an SRAM and can be dynamically reconfigured during operation.

252 citations


Proceedings ArticleDOI
01 Nov 1993
TL;DR: Results from compiling netlists indicate that virtual wires can increase FPGA gate utilization beyond 80 percent without a significant slowdown in emulation speed.
Abstract: Existing FPGA-based logic emulators only use a fraction of potential communication bandwidth because they dedicate each FPGA pin (physical wire) to a single emulated signal (logical wire). Virtual wires overcome pin limitations by intelligently multiplexing each physical wire among multiple logical wires and pipelining these connections at the maximum clocking frequency of the FPGA. A virtual wire represents a connection from a logical output on one FPGA to a logical input on another FPGA. Virtual wires not only increase usable bandwidth, but also relax the absolute limits imposed on gate utilization. The resulting improvement in bandwidth reduces the need for global interconnect, allowing effective use of low dimension inter-chip connections (such as nearest-neighbor). Nearest-neighbor topologies, coupled with the ability of virtual wires to overlap communication with computation, can even improve emulation speeds. The authors present the concept of virtual wires and describe their first implementation, a 'softwire' compiler which utilizes static routing and relies on minimal hardware support. Results from compiling netlists for the 18 K gate Sparcle microprocessor and the 86 K gate Alewife Communications and Cache Controller indicate that virtual wires can increase FPGA gate utilization beyond 80 percent without a significant slowdown in emulation speed.

206 citations
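
The multiplexing idea in the abstract above can be illustrated with a minimal Python sketch. It is not the softwire compiler: the function names, the round-robin assignment, and the example signal names are all assumptions made for illustration. It only shows how sharing each physical pin among several logical wires reduces the pin count, and how a static schedule fixes the time slot of every logical wire in advance.

```python
from math import ceil


def pins_needed(logical_wires: int, mux_factor: int) -> int:
    """Physical pins required when each pin carries up to `mux_factor` logical wires."""
    if mux_factor < 1:
        raise ValueError("mux_factor must be at least 1")
    return ceil(logical_wires / mux_factor)


def static_schedule(logical_wires: list[str], physical_pins: int) -> dict[int, list[str]]:
    """Assign logical wires to pins round-robin (an assumed policy).

    The position of a wire in its pin's list is the time slot in which
    that wire is driven onto the pin.
    """
    schedule: dict[int, list[str]] = {pin: [] for pin in range(physical_pins)}
    for i, wire in enumerate(logical_wires):
        schedule[i % physical_pins].append(wire)
    return schedule


# Hypothetical example: ten emulated signals, four logical wires per pin.
wires = [f"sig{i}" for i in range(10)]
pins = pins_needed(len(wires), mux_factor=4)
plan = static_schedule(wires, pins)
for pin, slots in plan.items():
    print(f"pin {pin}: {slots}")
```

With a multiplexing factor of 1 the sketch degenerates to the conventional arrangement the abstract criticizes, one pin per emulated signal.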


Proceedings ArticleDOI
05 Apr 1993
TL;DR: The architecture and compiler for a general-purpose metamorphic computing platform called PRISM-II are discussed; PRISM-II improves the performance of many computationally-intensive tasks by augmenting the functionality of the core processor with new instructions that match the characteristics of targeted applications.
Abstract: This paper discusses the architecture and compiler for a general-purpose metamorphic computing platform called PRISM-II. PRISM-II improves the performance of many computationally-intensive tasks by augmenting the functionality of the core processor with new instructions that match the characteristics of targeted applications. In essence, PRISM (processor reconfiguration through instruction set metamorphosis) is a general purpose hardware platform that behaves like an application-specific platform. Two methods for hardware synthesis, one using VHDL Designer and the other using X-BLOX, are presented and synthesis results are compared.

164 citations


Book ChapterDOI
01 Jan 1993
TL;DR: This paper suggests there are at least two approaches to be taken to build Darwin Machines, and suggests the first approach uses “software configurable hardware” chips, e.g. FPGAs, HDPLDs, or possibly a new generation of chips based on the ideas that FPGAs etc. embody.
Abstract: For the past three years, the author has been dreaming of the possibility of building machines which are capable of evolution, called “Darwin Machines”. As a result of several brainstorming sessions with some colleagues in electrical engineering, the author now realizes that hardware devices are on the market today, which use “software configurable hardware” technologies that the author believes can be used to build Darwin Machines within a year or two. This paper suggests there are at least two approaches to be taken. The first approach uses “software configurable hardware” chips, e.g. FPGAs (Field Programmable Gate Arrays), HDPLDs (High Density Programmable Logic Devices), or possibly a new generation of chips based on the ideas that FPGAs etc. embody. The second approach uses a special hardware device called a “hardware accelerator” which accelerates the simulation in software of digital hardware devices containing up to several hundred thousand gates. Darwin Machines will be essential if artificial nervous systems are to be evolved for biots (i.e. biological robots) which consist of thousands of evolved neural network modules (called GenNets). The evolution time of 1000-GenNet biots will need to be reduced by many orders of magnitude if they are to be built at all. It is for this reason that Darwin Machines may prove to be a breakthrough in biotic design. When molecular scale technologies come on line in the late 1990s, the Darwin Machine approach will probably be the only way to build self-assembling, self-testing molecular scale devices.

142 citations


Proceedings ArticleDOI
05 Apr 1993
TL;DR: Virtual computing is an entirely new form of supercomputing that allows an algorithm to be implemented in hardware; computing machines based on reconfigurable logic are hyper-scalable, meaning they scale up better than 1-1.
Abstract: Virtual computing is an entirely new form of supercomputing that allows an algorithm to be implemented in hardware. Based on the Xilinx FPGA and ICube's FPID, the Virtual Computer is completely reconfigurable in every respect. Computing machines based on reconfigurable logic are hyper-scalable, meaning they scale up better than 1-1.

136 citations


Journal ArticleDOI
TL;DR: The Realizer, a logic emulation system that automatically configures a network of field-programmable gate arrays (FPGAs) to implement large digital logic designs, is presented; its interconnection architecture, called the partial crossbar, greatly reduces system-level placement and routing complexity.
Abstract: The Realizer, a logic emulation system that automatically configures a network of field-programmable gate arrays (FPGAs) to implement large digital logic designs, is presented. Logic and interconnect are separated to achieve optimum FPGA utilization. Its interconnection architecture, called the partial crossbar, greatly reduces system-level placement and routing complexity, achieves bounded interconnect delay, scales linearly with pin count, and allows hierarchical expansion to systems with hundreds of thousands of FPGA devices in a fast and uniform way. An actual multiboard system has been built, using 42 Xilinx XC3090 FPGAs for logic. Several designs, including a 32-b CPU datapath, have been automatically realized and operated at speed. They demonstrate very good FPGA utilization. The Realizer has applications in logic verification and prototyping, simulation, architecture development, and special-purpose execution.

136 citations


Proceedings ArticleDOI
05 Apr 1993
TL;DR: A novel computation mechanism called the WASMII, which executes a target data flow graph directly, is proposed on the basis of the virtual hardware, and a highly parallel system can be easily constructed.
Abstract: Virtual hardware is a technique to realize a large digital circuit with a small amount of real hardware by using an extended Field Programmable Gate Array (FPGA) technology. Several configuration RAM modules are provided inside the FPGA chip, and the configuration of the gate array can be rapidly changed by replacing the active module. Data for configuration are transferred from an off-chip backup RAM to an unused configuration RAM module. A novel computation mechanism called the WASMII, which executes a target data flow graph directly, is proposed on the basis of the virtual hardware. A WASMII chip consists of the FPGA for virtual hardware and the additional mechanism to replace configuration RAM modules in a data-driven manner. Configuration data are preloaded in the order which is assigned in advance by a static scheduling preprocessor. By connecting a number of WASMII chips, a highly parallel system can be easily constructed.

123 citations


Journal ArticleDOI
S. Trimberger1
01 Jul 1993
TL;DR: A field-programmable gate array (FPGA) that can implement thousands of gates of logic, has no up-front fixed costs, and can be programmed in a few minutes by writing into on-chip static memory is described.
Abstract: A field-programmable gate array (FPGA) that can implement thousands of gates of logic, has no up-front fixed costs, and can be programmed in a few minutes by writing into on-chip static memory is described. This kind of FPGA can be reprogrammed any number of times, providing a versatile platform for rapid hardware implementation. Reprogrammable technology allows software-like design methodologies to be applied to logic design. The construction of this kind of FPGA, design tradeoffs, and examples of applications that take advantage of reprogrammability are examined.

Proceedings ArticleDOI
05 Apr 1993
TL;DR: A processor with multiple reconfigurable execution units has been designed and implemented and is able to compute the new state of 100'000'000 cells of Conway's game of life per second with a clock speed of 6.25 MHz.
Abstract: A processor with multiple reconfigurable execution units has been designed and implemented. The reconfigurable execution units are implemented using reprogrammable field programmable gate array (FPGA) chips. The architecture and implementation of this processor are described in detail. An example shows that this reconfigurable processor is able to compute the new state of 100'000'000 cells of Conway's game of life per second with a clock speed of 6.25 MHz.
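
The throughput figures quoted above imply a degree of parallelism that is easy to work out: 100,000,000 cell updates per second at 6.25 MHz is 16 cell updates per clock cycle. The Python sketch below checks that arithmetic and, for reference, gives a plain software version of one Game of Life generation. The update function is a generic textbook formulation on a wrap-around grid (an assumption), not the paper's hardware design.

```python
CELLS_PER_SECOND = 100_000_000   # figure quoted in the abstract
CLOCK_HZ = 6_250_000             # 6.25 MHz, as quoted in the abstract

# Cell updates that must complete in every clock cycle to reach that rate.
cells_per_cycle = CELLS_PER_SECOND / CLOCK_HZ


def count_neighbors(grid, r, c):
    """Number of live cells among the eight neighbors, with wrap-around edges."""
    rows, cols = len(grid), len(grid[0])
    total = 0
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            if (dr, dc) != (0, 0):
                total += grid[(r + dr) % rows][(c + dc) % cols]
    return total


def life_step(grid):
    """Compute one generation of Conway's Game of Life."""
    new_grid = []
    for r, row in enumerate(grid):
        new_row = []
        for c, alive in enumerate(row):
            n = count_neighbors(grid, r, c)
            # A cell lives if it has 3 neighbors, or 2 neighbors and is already alive.
            new_row.append(1 if n == 3 or (alive and n == 2) else 0)
        new_grid.append(new_row)
    return new_grid


# A horizontal "blinker" turns vertical after one generation.
blinker = [
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
]
next_gen = life_step(blinker)
print(cells_per_cycle)
```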

Proceedings ArticleDOI
03 Oct 1993
TL;DR: The architecture of Splash 2 is designed to accelerate the solution of problems which exhibit at least modest amounts of temporal or data parallelism, and has been shown to be effective on a variety of applications, including text searching, sequence analysis, and image processing.
Abstract: Splash 2 is an attached parallel processor in which the computing elements are user-programmable FPGA devices. The architecture of Splash 2 is designed to accelerate the solution of problems which exhibit at least modest amounts of temporal or data parallelism. Applications are developed by writing descriptions of algorithms in VHDL, which are then iteratively refined and debugged within a simulator. Once an application is determined to be functionally correct in simulation, it is compiled to a gate list and optimized by logic synthesis. The gate list is then mapped onto the FPGA architecture by automatic placement and routing tools to form a loadable FPGA object module. A C language library and a symbolic debugger comprise the execution environment. The Splash 2 system has been shown to be effective on a variety of applications, including text searching, sequence analysis, and image processing.

Journal ArticleDOI
Jonathan W. Greene, E. Hamdy1, S. Beal1
01 Jul 1993
TL;DR: A brief survey of antifuse technologies is provided, and some inherent tradeoffs involving the antifuse characteristics, routing architecture and logic module are illustrated.
Abstract: An antifuse is an electrically programmable two-terminal device with small area and low parasitic resistance and capacitance. Field-programmable gate arrays (FPGAs) using antifuses in a segmented channel routing architecture now offer the digital logic capabilities of an 8000-gate conventional gate array and system speeds of 40-60 MHz. A brief survey of antifuse technologies is provided. The antifuse technology, routing architecture, logic module, design automation, programming, testing and use of ACT antifuse FPGAs are described. Some inherent tradeoffs involving the antifuse characteristics, routing architecture and logic module are illustrated.

Proceedings ArticleDOI
09 May 1993
TL;DR: It is concluded that, in this redundancy scheme, the sufficient number of spare rows is one or two for practical cases and the gross yield product can be doubled at an early stage of production.
Abstract: A redundancy scheme and circuitry for field programmable gate arrays (FPGAs) are proposed. The scheme requires the modification of the wiring resource segmentation and the addition of spare rows and selector circuits. An improved yield gross product is quantitatively studied. The disadvantages caused by this architecture, such as an area overhead and speed degradation, are discussed. It is concluded that, in this redundancy scheme, the sufficient number of spare rows is one or two for practical cases and the gross yield product can be doubled at an early stage of production. The proposed scheme is applicable to a wide range of FPGA architectures.

Journal ArticleDOI
TL;DR: A number of implementation issues that determine a systolic array's performance efficiency, such as algorithms and mapping, system integration through memory subsystems, cell granularity, and extensibility to a wide variety of topologies are described.
Abstract: The extension of systolic array architecture from fixed- or special-purpose architectures to general-purpose, SIMD (single-instruction stream, multiple-data stream), MIMD (multiple-instruction stream, multiple-data stream) architectures, and hybrid architectures that combine both commercial and FPGA (field-programmable gate array) technologies is chronicled. The authors present a taxonomy for systolic organizations, discuss each architecture's methods of exploiting concurrencies, and compare performance attributes of each. The authors also describe a number of implementation issues that determine a systolic array's performance efficiency, such as algorithms and mapping, system integration through memory subsystems, cell granularity, and extensibility to a wide variety of topologies.

Proceedings ArticleDOI
05 Apr 1993
TL;DR: The architecture of Splash 2 is designed to accelerate the solution of problems which exhibit at least modest amounts of temporal or data parallelism, and has been shown to be effective on a variety of applications, including text searching, sequence analysis, and image processing.
Abstract: Splash 2 is an attached special purpose parallel processor in which the computing elements are user programmable FPGA devices. The architecture of Splash 2 is designed to accelerate the solution of problems which exhibit at least modest amounts of temporal or data parallelism. Applications are developed by writing behavioral descriptions of algorithms in VHDL, which are then iteratively refined and debugged within the Splash 2 simulator. Once an application is determined to be functionally correct in simulation, it is compiled to a gate list and optimized by logic synthesis. The gate list is then mapped onto the FPGA architecture by automatic placement and routing tools to form a loadable FPGA object module. A C language library and a symbolic debugger comprise the execution environment. The Splash 2 system has been shown to be effective on a variety of applications, including text searching, sequence analysis, and image processing.

Patent
12 Feb 1993
TL;DR: A fault tolerant IC device is made from a wafer of field programmable gate arrays (FPGAs), and a bit-stream generator is used to generate configuration data to program each FPGA to perform its desired function.
Abstract: A fault tolerant IC device is made from a wafer of field programmable gate arrays (FPGAs). Each FPGA is first tested and a wafer map of defective FPGA locations is recorded. A hardware description defines desired circuit operation either via a schematic or a functional description such as an equation or a formula. The hardware description is compiled into a list of required wafer resources and a partitioner allocates this list among the resources available in the FPGAs on the wafer. An automatic router then interconnects these resources to implement the circuit function, using the wafer map to avoid all defective FPGA locations. A bit-stream generator then generates the configuration data to program each FPGA to perform its desired function. The resulting wafer-scale circuit is wafer fault tolerant since the programming avoids any non-functional portions of the wafer. Possible embodiments include XILINX FPGAs, custom wafers with FPGAs and special circuitry and wafers having FPGAs programmed to form RISC processors.

Proceedings ArticleDOI
05 Apr 1993
TL;DR: The authors examine the architectural tradeoffs involved in designing general purpose FPGA-based computing systems with field-programmable gate arrays and field-programmable interconnects.
Abstract: Reprogrammable Field-Programmable Gate Arrays (FPGAs) have enabled the realization of high-performance and affordable reconfigurable computing engines. The authors examine the architectural tradeoffs involved in designing general purpose FPGA-based computing systems with field-programmable gate arrays and field-programmable interconnects. The fact that FPGAs provide both programmable logic and programmable interconnects raises numerous design issues that need to be considered with care. Factors that influence the tradeoffs are routability, rearrangeability and speed.

Proceedings ArticleDOI
27 Sep 1993
TL;DR: Based on these characteristics, FPGA kernel architectures and related technologies are identified that will significantly improve the performance and capabilities of ASIC hardware emulators.
Abstract: The increasing complexity available in application-specific integrated circuits (ASICs) and the requirement for reduced design time have increased the importance of hardware emulators for ASIC system verification prior to first silicon. The authors review current ASIC hardware emulators based on field programmable gate arrays (FPGAs), including their limitations and constraints. Based on these characteristics, FPGA kernel architectures and related technologies are identified that will significantly improve the performance and capabilities of ASIC hardware emulators.

Journal ArticleDOI
TL;DR: A stochastic model that facilitates exploration of a wide range of FPGA routing architectures using a theoretical approach is described and the routability predictions from the model are validated by comparing them with the results of a previously published experimental study on FPGA routability.
Abstract: One area of particular importance is the design of an FPGA routing architecture, which houses the user-programmable switches and wires that are used to interconnect the FPGA's logic resources. Because the routing switches consume significant chip area and introduce propagation delays, the design of the routing architecture greatly influences both the area utilization and speed performance of an FPGA. FPGA routing architectures have already been studied using experimental techniques. This paper describes a stochastic model that facilitates exploration of a wide range of FPGA routing architectures using a theoretical approach. In the stochastic model an FPGA is represented as an N*N array of logic blocks separated by both horizontal and vertical routing channels, similar to a Xilinx FPGA. A circuit to be routed is represented by additional parameters that specify the total number of connections, and each connection's length and trajectory. The stochastic model gives an analytic expression for the routability of the circuit in the FPGA. Practically speaking, routability can be viewed as the likelihood that a circuit can be successfully routed in a given FPGA. The routability predictions from the model are validated by comparing them with the results of a previously published experimental study on FPGA routability.

Proceedings ArticleDOI
Nam Sung Woo1, Jaeseok Kim
01 Jul 1993
TL;DR: The MP2 method is described, which is the first improvement method that considers pin constraints of blocks and has been applied to partitioning technology-mapped circuits into multiple FPGA chips.
Abstract: We developed a new method, called MP2, for partitioning networks into multiple (> 2) blocks each of which has both size and pin constraints. The MP2 method uses an improvement approach and tries to minimize the total number of terminals of all blocks while satisfying the pin and size constraints of every block. It supports multiple classes of cells in input networks and blocks. It makes use of a scalar value of benefit which captures lookahead information. It is the first improvement method that considers pin constraints of blocks. It has been applied to partitioning technology-mapped circuits into multiple FPGA chips. In addition to describing the MP2 method, we will discuss some interesting findings we gleaned during our experiments.
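
The constraints described above can be made concrete with a small Python sketch. It is not the MP2 algorithm: it only evaluates a given partition, and it assumes the common definition that a net contributes one terminal to every block it touches whenever it spans more than one block. The function names, the netlist, and the limits are hypothetical.

```python
def block_terminals(nets, assignment):
    """Count, for each block, the nets that cross that block's boundary."""
    terminals = {block: 0 for block in set(assignment.values())}
    for net in nets:
        blocks_touched = {assignment[cell] for cell in net}
        if len(blocks_touched) > 1:          # net leaves at least one block
            for block in blocks_touched:
                terminals[block] += 1
    return terminals


def is_feasible(nets, assignment, cell_sizes, max_size, max_pins):
    """Check the size and pin constraints of every block in a partition."""
    block_size = {}
    for cell, block in assignment.items():
        block_size[block] = block_size.get(block, 0) + cell_sizes[cell]
    terminals = block_terminals(nets, assignment)
    return all(
        block_size[b] <= max_size and terminals[b] <= max_pins
        for b in block_size
    )


# Hypothetical four-cell ring split across two blocks.
nets = [{"a", "b"}, {"b", "c"}, {"c", "d"}, {"a", "d"}]
assignment = {"a": 0, "b": 0, "c": 1, "d": 1}
sizes = {"a": 1, "b": 1, "c": 1, "d": 1}

terms = block_terminals(nets, assignment)
total_terminals = sum(terms.values())       # the quantity an improvement method would try to reduce
print(terms, total_terminals)
```

An improvement method in the sense of the abstract would move cells between blocks to lower `total_terminals` while keeping `is_feasible` true; the move-selection logic is outside the scope of this sketch.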

Patent
02 Nov 1993
TL;DR: A field programmable gate array (FPGA) comprising a plurality of circuit blocks each having logic circuits, at least one spare circuit block, and programmable interconnections is described; a connecting element on the interconnections switches between on and off states when programmed, so a defective circuit block can be replaced with the spare.
Abstract: A field programmable gate array comprises: a plurality of circuit blocks each having logic circuits; at least one spare circuit block having logic circuits; a set of interconnections including at least one interconnection for connecting at least one of the circuit blocks and the at least one spare circuit programmably; and at least one connecting element disposed on the interconnection of the set of interconnections which turns its status from a turned-on state to a turned-off state or vice versa when programmed. When any one of the circuit blocks is defective, since the defective circuit block can be replaced with the spare circuit block, it is possible to retain any desired functions of the logic circuits by programming the connecting means, thus improving the production yield of the field programmable gate array and thereby reducing the manufacturing cost thereof.

Patent
08 Dec 1993
TL;DR: In this paper, a method for compressing configuration bitstreams used to program Field Programmable Gate Arrays (FPGAs) and for decreasing the amount of time necessary to configure FPGAs is presented.
Abstract: Apparatus and method for compressing configuration bitstreams used to program Field Programmable Gate Arrays (FPGAs) and for decreasing the amount of time necessary to configure FPGAs. In a first embodiment of the present invention, a shift register is employed that enables data bits to be shifted multiple positions per clock cycle through the shift register. As a result, the amount of time required to shift the data bits through the shift register can be reduced by a factor of N, where N is the number of positions per clock cycle. The shift register also permits the option of shifting bits through the shift register one bit per clock cycle. In a second embodiment of the present invention, control and address bits are employed to more efficiently manage and reduce the size of the configuration bitstream. Accordingly, one embodiment provides the option of loading data into the array of the FPGA by address column in a non-sequential fashion. In other words, to streamline loading of data into the array from the data shift register, the present invention permits non-sequential writing of frames into the array by column address. Another preferred embodiment of the present invention permits a previous frame of data (repetitive data) to be loaded into the array without having to resupply the data shift register with the repetitive data. Simple logic control bits indicate how frames of data are to be managed.
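
The timing claim for the first embodiment amounts to simple arithmetic: loading B bits at N positions per clock takes about B/N cycles instead of B. The Python sketch below models this behaviorally; the function names and the example frame length are assumptions, and nothing here reflects the patent's actual circuit.

```python
from math import ceil


def shift_cycles(num_bits: int, positions_per_clock: int = 1) -> int:
    """Clock cycles needed to shift `num_bits` through the register."""
    if positions_per_clock < 1:
        raise ValueError("positions_per_clock must be at least 1")
    return ceil(num_bits / positions_per_clock)


def load_register(bits, positions_per_clock):
    """Behavioral model: accept up to `positions_per_clock` bits on each clock."""
    register = []
    cycles = 0
    for start in range(0, len(bits), positions_per_clock):
        register.extend(bits[start:start + positions_per_clock])
        cycles += 1
    return register, cycles


# Hypothetical 1024-bit frame, shifted one bit or eight bits per clock.
print(shift_cycles(1024, 1), shift_cycles(1024, 8))

contents, cycles = load_register([1, 0, 1, 1, 0], positions_per_clock=2)
```

When the bit count is not a multiple of N, the final clock carries a partial group, which is why the sketch rounds up.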

Proceedings ArticleDOI
Erik Brunvand1
05 Jan 1993
TL;DR: The NSR processor is a general-purpose computer structured as a collection of self-timed blocks that operate concurrently and cooperate by communicating with other blocks using self-timed communication protocols.
Abstract: The NSR processor is a general-purpose computer structured as a collection of self-timed blocks. These blocks operate concurrently and cooperate by communicating with other blocks using self-timed communication protocols. The blocks that make up the NSR processor correspond to standard synchronous pipeline stages such as instruction fetch, instruction decode, execute, memory interface and register file, but each operates concurrently as a separate self-timed process. In addition to being internally self-timed, the units are decoupled through self-timed first-in first-out (FIFO) queues between each of the units which allows a high degree of overlap in instruction execution. Branches, jumps, and memory accesses are also decoupled through the use of additional FIFO queues which can hide the execution latency of these instructions. A prototype implementation of the NSR processor has been constructed using Actel FPGAs (field programmable gate arrays).

01 Jan 1993
TL;DR: This paper discusses how complete processors might be fabricated with a minimum of “fixed” or static logic and considers what evolutions of current logic families would favour this type of application.
Abstract: Recent developments in the design and fabrication of field programmable logic devices (FPGAs) may well change the way in which we design and fabricate conventional microprocessors. The use of uncommitted logic whose function may be modified at run time makes the prospect of dynamic application specific integrated circuits closer to reality than ever before. Much of the work to date on reconfigurable logic has focussed on its application in co-processor and “glue” roles. This paper discusses how complete processors might be fabricated with a minimum of “fixed” or static logic. It is shown that in order to exploit FPGAs, a processor that is radically different from conventional architectures is required. The paper concludes by considering what evolutions of current logic families would favour this type of application.

01 Jul 1993
TL;DR: The three most popular types of FPGA architectures are considered, namely those using logic blocks based on lookup tables, multiplexers and wide AND/OR arrays, and the emphasis is on tools which attempt to minimize the area of the combinational logic part of a design.
Abstract: Field programmable gate arrays (FPGAs) reduce the turnaround time of application-specific integrated circuits from weeks to minutes. However, the high complexity of their architectures makes manual mapping of designs time consuming and error prone, thereby offsetting any turnaround advantage. Consequently, effective design automation tools are needed to reduce design time. Among the most important is logic synthesis. While standard synthesis techniques could be used for FPGAs, the quality of the synthesized designs is often unacceptable. As a result, much recent work has been devoted to developing logic synthesis tools targeted to different FPGA architectures. The paper surveys this work. The three most popular types of FPGA architectures are considered, namely those using logic blocks based on lookup tables, multiplexers and wide AND/OR arrays. The emphasis is on tools which attempt to minimize the area of the combinational logic part of a design since little work has been done on optimizing performance or routability, or on synthesis of the sequential part of a design. The different tools surveyed are compared using a suite of benchmark designs.
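
For readers unfamiliar with the first of the three logic block types, the following Python sketch shows why a lookup table is a convenient synthesis target: a k-input table stores 2^k configuration bits and can therefore realize any Boolean function of k inputs. The helper names and the majority-function example are illustrative assumptions, not taken from the surveyed tools.

```python
def make_lut(func, k):
    """Return the 2**k configuration bits that make a k-input LUT compute `func`.

    Bit `index` of the table holds the function value for the input
    combination whose i-th input equals bit i of `index`.
    """
    bits = []
    for index in range(2 ** k):
        inputs = [(index >> i) & 1 for i in range(k)]
        bits.append(func(*inputs) & 1)
    return bits


def lut_eval(bits, inputs):
    """Evaluate the LUT: the inputs simply form an address into the table."""
    index = sum(bit << i for i, bit in enumerate(inputs))
    return bits[index]


def majority(a, b, c):
    """Hypothetical example function: 1 when at least two inputs are 1."""
    return (a & b) | (a & c) | (b & c)


config = make_lut(majority, 3)
print(config)
print(lut_eval(config, [1, 0, 1]))
```

Because the table size grows as 2^k, synthesis tools for this architecture mostly work at decomposing a large function into a small number of k-input pieces, which is the area-minimization problem the abstract emphasizes.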

Book ChapterDOI
24 May 1993
TL;DR: This paper shows how to compile a program written in a subset of occam into a normal form suitable for further processing into a netlist of components which may be loaded into a Field-Programmable Gate Array (FPGA).
Abstract: This paper shows how to compile a program written in a subset of occam into a normal form suitable for further processing into a netlist of components which may be loaded into a Field-Programmable Gate Array (FPGA). A simple state-machine model is adopted for specifying the behaviour of a synchronous circuit where the observable includes the state of the control path and the data path of the circuit. We identify the behaviour of a circuit with a program consisting of a very restricted subset of occam. Algebraic laws are used to facilitate the transformation from a program into a normal form. The compiling specification is presented as a set of theorems that must be proved correct with respect to these laws. A rapid prototype compiler in the form of a logic program may be implemented from these theorems.

Proceedings ArticleDOI
05 Apr 1993
TL;DR: The authors demonstrate a new technique for automatically synthesizing digital logic from a high level algorithmic description in a data parallel language using the Splash 2 reconfigurable logic arrays for programs written in Data-parallel Bit-serial C.
Abstract: The authors demonstrate a new technique for automatically synthesizing digital logic from a high level algorithmic description in a data parallel language. The methodology has been implemented using the Splash 2 reconfigurable logic arrays for programs written in Data-parallel Bit-serial C (dbC). The translator generates a VHDL description of a SIMD processor array with one or more processors per Xilinx 4010 FPGA. The instruction set of each processor is customized to the dbC program being processed. In addition to the usual arithmetic operations, nearest neighbor communication, host-to-processor communication, and global reductions are supported.

Journal ArticleDOI
01 Aug 1993
TL;DR: A representative DSP design application in the form of an 8 tap FIR filter is offered for the Xilinx XC3042 field programmable logic array (FPGA) to obtain a realistic placing and routing of configurable logic block and in/out block components.
Abstract: Distributed arithmetic techniques are the key to efficient implementation of DSP algorithms in FPGAs. The distributed arithmetic process is briefly described. A representative DSP design application in the form of an 8 tap FIR filter is offered for the Xilinx XC3042 field programmable logic array (FPGA). The design is presented in sufficient detail--from filter specifications via filter design software through detailed logic of salient data and control functions to obtain a realistic placing and routing of configurable logic block (CLBs) and in/out block (IOBs) components for simulation verification and performance evaluation vis-a-vis commercially available dedicated 8 tap FIR filter chips.
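
A minimal Python sketch of the distributed arithmetic idea follows. It is a generic textbook formulation, not the article's XC3042 design: the coefficient values, sample values, word width, and function names are all assumptions. The multiply-accumulate of an 8-tap filter is replaced by a table of precomputed coefficient sums, addressed one bit-plane of the input samples at a time, with shift-and-add accumulation.

```python
def build_da_table(coeffs):
    """Precompute, for every tap bit pattern, the sum of the selected coefficients."""
    taps = len(coeffs)
    return [
        sum(c for i, c in enumerate(coeffs) if (addr >> i) & 1)
        for addr in range(2 ** taps)
    ]


def da_dot(coeffs, samples, width):
    """Dot product of `coeffs` and `samples` via distributed arithmetic.

    `samples` are signed integers that fit in `width` bits (two's complement).
    Hardware would process one bit-plane per clock; here that is a loop.
    """
    table = build_da_table(coeffs)
    acc = 0
    for b in range(width):
        # Gather bit b of every sample into one table address.
        addr = 0
        for i, x in enumerate(samples):
            addr |= ((x >> b) & 1) << i
        partial = table[addr] << b
        # The most significant bit carries negative weight in two's complement.
        acc += -partial if b == width - 1 else partial
    return acc


# Hypothetical 8-tap coefficients and 4-bit signed samples.
coeffs = [1, -2, 3, 4, 4, 3, -2, 1]
samples = [5, -3, 7, 0, -8, 2, 1, -1]

da_result = da_dot(coeffs, samples, width=4)
direct_result = sum(c * x for c, x in zip(coeffs, samples))
print(da_result, direct_result)
```

The table has 2^8 = 256 entries for eight taps, so no multiplier is needed, which is the property that makes the technique attractive for lookup-table-based FPGAs.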