
Showing papers on "Reconfigurable computing published in 1995"


Patent
Michael A. Baxter
17 Apr 1995
TL;DR: In this paper, the authors present a system for scalable, parallel, dynamically reconfigurable computing, which includes a set of S-machines, a T-machine corresponding to each S-machine, a General Purpose Interconnect Matrix (GPIM), a set of I/O T-machines, and a master time-base unit.
Abstract: A set of S-machines, a T-machine corresponding to each S-machine, a General Purpose Interconnect Matrix (GPIM), a set of I/O T-machines, a set of I/O devices, and a master time-base unit form a system for scalable, parallel, dynamically reconfigurable computing. Each S-machine is a dynamically reconfigurable computer having a memory, a first local time-base unit, and a Dynamically Reconfigurable Processing Unit (DRPU). The DRPU is implemented using a reprogrammable logic device configured as an Instruction Fetch Unit (IFU), a Data Operate Unit (DOU), and an Address Operate Unit (AOU), each of which are selectively reconfigured during program execution in response to a reconfiguration interrupt or the selection of a reconfiguration directive embedded within a set of program instructions. Each reconfiguration interrupt and each reconfiguration directive references a configuration data set specifying a DRPU hardware organization optimized for the implementation of a particular Instruction Set Architecture (ISA). The IFU directs reconfiguration operations, instruction fetch and decode operations, memory access operations, and issues control signals to the DOU and the AOU to facilitate instruction execution. The DOU performs data computations, and the AOU performs address computations. Each T-machine is a data transfer device having a common interface and control unit, one or more interconnect I/O units, and a second local time-base unit. The GPIM is a scalable interconnect network that facilitates parallel communication between T-machines. The set of T-machines and the GPIM facilitate parallel communication between S-machines.

182 citations
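
As a rough illustration of the reconfiguration mechanism summarized in this patent abstract, the sketch below models a DRPU that swaps in a configuration data set when a directive names a target ISA. All names, file names, and the dictionary layout are invented for the example; this is not the patented design.

```python
# Hypothetical sketch (not from the patent): a software model of an S-machine's
# DRPU selecting a configuration data set when a reconfiguration directive or
# interrupt names a target Instruction Set Architecture (ISA).

# Each configuration data set specifies a DRPU organization (IFU, DOU, AOU)
# optimized for one ISA; the keys and file names here are illustrative only.
CONFIG_DATA_SETS = {
    "dsp_isa":    {"IFU": "ifu_dsp.bit",    "DOU": "dou_mac.bit", "AOU": "aou_mod.bit"},
    "scalar_isa": {"IFU": "ifu_scalar.bit", "DOU": "dou_alu.bit", "AOU": "aou_flat.bit"},
}

class DRPU:
    """Minimal model of a Dynamically Reconfigurable Processing Unit."""
    def __init__(self):
        self.active_isa = None
        self.units = {}            # current IFU/DOU/AOU "bitstreams"

    def reconfigure(self, isa_name):
        """Handle a reconfiguration directive/interrupt referencing an ISA."""
        config = CONFIG_DATA_SETS[isa_name]
        self.units = dict(config)  # stand-in for reprogramming the logic device
        self.active_isa = isa_name

drpu = DRPU()
drpu.reconfigure("dsp_isa")        # directive embedded in the program stream
print(drpu.active_isa, drpu.units["DOU"])
```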


Journal ArticleDOI
TL;DR: Automatic mapping tools for Triptych, an FPGA architecture with improved logic density and performance over commercial FPGAs, and extensions to these algorithms for mapping asynchronous circuits to Montage, the first FPGA architecture to completely support asynchronous and synchronous interface applications, are described.
Abstract: Field-programmable gate arrays (FPGAs) are becoming an increasingly important implementation medium for digital logic. One of the most important keys to using FPGAs effectively is a complete, automated software system for mapping onto the FPGA architecture. Unfortunately, many of the tools necessary require different techniques than traditional circuit implementation options, and these techniques are often developed specifically for only a single FPGA architecture. In this paper we describe automatic mapping tools for Triptych, an FPGA architecture with improved logic density and performance over commercial FPGAs. These tools include a simulated-annealing placement algorithm that handles the routability issues of fine-grained FPGAs, and an architecture-adaptive routing algorithm that can easily be retargeted to other FPGAs. We also describe extensions to these algorithms for mapping asynchronous circuits to Montage, the first FPGA architecture to completely support asynchronous and synchronous interface applications.

177 citations
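
The placement step mentioned above uses simulated annealing; the sketch below shows the general technique on a toy grid placement problem minimizing Manhattan wirelength. The cost model, move set, and cooling schedule are illustrative assumptions, not the Triptych tools.

```python
import random, math

# Toy simulated-annealing placer: cells are placed on a W x H grid and swapped to
# reduce total Manhattan wirelength of two-pin nets.
W, H = 8, 8
cells = [f"c{i}" for i in range(40)]
nets = [(random.choice(cells), random.choice(cells)) for _ in range(60)]

slots = [(x, y) for x in range(W) for y in range(H)]
random.shuffle(slots)
loc = dict(zip(cells, slots))              # initial random placement

def wirelength():
    return sum(abs(loc[a][0] - loc[b][0]) + abs(loc[a][1] - loc[b][1]) for a, b in nets)

T, cost = 5.0, wirelength()
while T > 0.01:
    for _ in range(200):
        a, b = random.sample(cells, 2)     # propose swapping two cells
        loc[a], loc[b] = loc[b], loc[a]
        new = wirelength()
        if new <= cost or random.random() < math.exp((cost - new) / T):
            cost = new                     # accept improving or occasional uphill move
        else:
            loc[a], loc[b] = loc[b], loc[a]  # reject: undo the swap
    T *= 0.9                               # geometric cooling
print("final wirelength:", cost)
```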


Book ChapterDOI
01 Sep 1995
TL;DR: Two basic implementation approaches with FPGAs, compile-time reconfiguration and run-time reconfiguration, are discussed, along with existing applications for each strategy.
Abstract: Reconfigurable FPGAs provide designers with new implementation approaches for designing high-performance applications. This paper discusses two basic implementation approaches with FPGAs: compile-time reconfiguration and run-time reconfiguration. Compile-time reconfiguration is a static implementation strategy where each application consists of one configuration. Run-time reconfiguration is a dynamic implementation strategy where each application consists of multiple cooperating configurations. This paper introduces these strategies and discusses the implementation approaches for each strategy. Existing applications for each strategy are also discussed.

124 citations
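
A minimal software analogy of the two strategies, assuming a hypothetical load_configuration() stand-in for a bitstream loader: compile-time reconfiguration loads one configuration for the whole application, while run-time reconfiguration swaps several cooperating configurations during execution.

```python
# Illustrative contrast between the two strategies discussed above.  The
# load_configuration() call is a hypothetical stand-in for a vendor bitstream loader.

def load_configuration(fpga, bitstream):
    fpga["config"] = bitstream            # pretend reconfiguration

def run(fpga, data):
    return f"{fpga['config']} processed {len(data)} items"

fpga = {"config": None}

# Compile-time reconfiguration: the application is one static configuration.
load_configuration(fpga, "filter_full.bit")
print(run(fpga, range(1000)))

# Run-time reconfiguration: the application is several cooperating configurations,
# swapped in sequence (here: stages of a pipeline time-sharing one device).
for stage_bitstream in ["stage1_dct.bit", "stage2_quant.bit", "stage3_entropy.bit"]:
    load_configuration(fpga, stage_bitstream)    # reconfiguration overhead paid here
    print(run(fpga, range(1000)))
```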


Proceedings ArticleDOI
01 Aug 1995
TL;DR: A datapath synthesis system (DPSS) for the reconfigurable datapath architecture (rDPA) that allows automatic mapping of high level descriptions onto the rDPA without manual interaction is presented.
Abstract: A datapath synthesis system (DPSS) for the reconfigurable datapath architecture (rDPA) is presented. The DPSS allows automatic mapping of high level descriptions onto the rDPA without manual interaction. The required algorithms of this synthesis system are described in detail. Optimization techniques like loop folding or loop unrolling are sketched. The rDPA is scalable to arbitrarily large arrays and reconfigurable to be adaptable to the computational problem. Fine grained parallelism is achieved by using simple reconfigurable processing elements which are called datapath units (DPUs). The rDPA can be used as a reconfigurable ALU for bus oriented systems as well as for rapid prototyping of high speed datapaths.

118 citations
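
Loop unrolling is one of the optimizations sketched for the DPSS; the toy transform below replicates a loop body over a simple string-based IR to show the idea. The IR and helper function are assumptions for illustration, not the actual synthesis system.

```python
# Toy loop-unrolling transform on a list-of-statements IR, in the spirit of the
# optimizations sketched for the DPSS (the IR and helper are invented for illustration).

def unroll(body, index_var, trip_count, factor):
    """Replicate `body` `factor` times per iteration, rewriting the index."""
    assert trip_count % factor == 0, "sketch assumes factor divides trip count"
    unrolled = []
    for k in range(factor):
        for stmt in body:
            # rewrite e.g. "a[i]" into "a[(i+1)]" for the k-th copy
            unrolled.append(stmt.replace(index_var, f"({index_var}+{k})") if k else stmt)
    return unrolled, trip_count // factor, factor  # new body, new trip count, new step

body = ["y[i] = a[i] * x[i]", "s = s + y[i]"]
new_body, trips, step = unroll(body, "i", 16, 4)
for stmt in new_body:
    print(stmt)
print("iterations:", trips, "step:", step)
```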


Journal ArticleDOI
TL;DR: Triptych is presented, an FPGA architecture designed to achieve improved logic density with competitive performance by allowing a per-mapping tradeoff between logic and routing resources, and with a routing scheme designed to match the structure of typical circuits.
Abstract: Field-programmable gate arrays (FPGAs) are an important implementation medium for digital logic. Unfortunately, they currently suffer from poor silicon area utilization due to routing constraints. In this paper we present Triptych, an FPGA architecture designed to achieve improved logic density with competitive performance. This is done by allowing a per-mapping tradeoff between logic and routing resources, and with a routing scheme designed to match the structure of typical circuits. We show that, using manual placement, this architecture yields a logic density improvement of up to a factor of 3.5 over commercial FPGAs, with comparable performance. We also describe Montage, the first FPGA architecture to fully support asynchronous and synchronous interface circuits.

115 citations


Patent
26 May 1995
TL;DR: In this paper, the authors present an apparatus and method for decreasing the amount of time necessary to load configuration data into Field Programmable Gate Arrays (FPGAs) or other integrated circuit devices.
Abstract: An apparatus and method for decreasing the amount of time necessary to load configuration data into Field Programmable Gate Arrays (FPGAs) or other integrated circuit devices. In a preferred embodiment, serially arrayed FPGAs receive a concatenated stream of data from a common data bus. As a first FPGA reaches a loading-complete state, an enabling token is passed from the first FPGA to an enabling input on the next FPGA. The process repeats until all devices are completely loaded or fully configured.

110 citations
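
A small simulation of the daisy-chained loading scheme described in the abstract, assuming invented device names and configuration sizes: every device watches the common bus, but only the holder of the enabling token latches data, and the token moves on when loading completes.

```python
# Simulation of the daisy-chained loading scheme described above: all devices see
# the common data bus, but only the device holding the enable token latches data;
# on reaching its loading-complete state it passes the token on.  Sizes are invented.

class FPGA:
    def __init__(self, name, config_size):
        self.name, self.config_size = name, config_size
        self.loaded = bytearray()
        self.enabled = False

    def clock_in(self, byte):
        """Latch one byte from the shared bus if this device holds the token."""
        if self.enabled and len(self.loaded) < self.config_size:
            self.loaded.append(byte)
            if len(self.loaded) == self.config_size:   # loading-complete state
                self.enabled = False
                return "token_out"
        return None

chain = [FPGA("U1", 4), FPGA("U2", 6), FPGA("U3", 3)]
chain[0].enabled = True                                # token starts at first device
stream = bytes(range(4 + 6 + 3))                       # concatenated configuration data

for byte in stream:
    for i, dev in enumerate(chain):
        if dev.enabled:
            if dev.clock_in(byte) == "token_out" and i + 1 < len(chain):
                chain[i + 1].enabled = True            # pass enabling token downstream
            break                                      # only the token holder latches this byte

print([(d.name, bytes(d.loaded).hex()) for d in chain])
```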


Proceedings ArticleDOI
19 Sep 1995
TL;DR: A dynamic instruction set computer (DISC) has been developed to support demand-driven instruction set modification and enhances the functional density of FPGAs by physically relocating instruction modules to available FPGA space.
Abstract: A Dynamic Instruction Set Computer (DISC) has been developed to support demand-driven instruction set modification. Using partial reconfiguration, DISC pages instruction modules in and out of an FPGA as demanded by the executing program. Instructions occupy FPGA resources only when needed, and FPGA resources can be reused to implement an arbitrary number of performance-enhancing application-specific instructions. DISC further enhances the functional density of FPGAs by physically relocating instruction modules to available FPGA space. An image processing application was developed on DISC to demonstrate the advantages of paging application-specific instruction modules. Keywords: FPGA processor, run-time reconfiguration, relocatable hardware, application-specific processor. 1 INTRODUCTION: For many digital systems, general purpose processors do not provide sufficient processing power to operate acceptably in real-time environments. Specialized computing resources, such as digital signal processors and application-specific processors, are often used to improve available computation. A relatively new approach to improving the available computing power of embedded systems is to implement application specific circuits with Field Programmable Gate Arrays (FPGAs). Although more expensive than custom circuits, FPGAs provide a simplified, low-cost design environment.

108 citations
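
The paging behaviour described above can be mimicked in software with a small LRU model: instruction modules occupy rows of a fixed region only while needed and are evicted when space runs out. Module names, sizes, and the LRU policy are invented for illustration and are not the DISC implementation.

```python
# Toy model of demand-driven instruction paging in the spirit of DISC: instruction
# modules occupy rows of a fixed FPGA region only while needed; when space runs
# out, the least recently used module is evicted.  Sizes and names are invented.

from collections import OrderedDict

FPGA_ROWS = 8                                  # available reconfigurable rows
MODULE_ROWS = {"edge_detect": 3, "histogram": 2, "threshold": 1, "dilate": 4}

resident = OrderedDict()                       # module -> rows, kept in LRU order

def execute(instr):
    if instr in resident:                      # hit: module already configured
        resident.move_to_end(instr)
    else:                                      # miss: page the module in
        while sum(resident.values()) + MODULE_ROWS[instr] > FPGA_ROWS:
            evicted, _ = resident.popitem(last=False)   # evict least recently used
            print("  evict", evicted)
        resident[instr] = MODULE_ROWS[instr]   # partial reconfiguration happens here
        print("  load ", instr)
    print("run", instr, "| resident:", list(resident))

for instr in ["edge_detect", "histogram", "edge_detect", "dilate", "threshold"]:
    execute(instr)
```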


Journal ArticleDOI
TL;DR: The results of this work suggest that run-time reconfiguration is a powerful technique with potential for a wide range of video applications in which temporal algorithm partitioning and rapid adaptivity are feasible and desired.
Abstract: Video coding has been implemented by using rapid reconfiguration to time share hardware for several sequential stages. This allows the chip area to be reduced by a factor proportional to the number of coding stages at the expense of some reconfiguration overhead and the added memory and control needed to implement reconfiguration. The results of this work suggest that run-time reconfiguration is a powerful technique with potential for a wide range of video applications in which temporal algorithm partitioning and rapid adaptivity are feasible and desired.

97 citations
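
A back-of-envelope model of the tradeoff the paper reports, with invented numbers: time-sharing one region across N sequential coding stages divides the resident logic area by roughly N while adding N reconfigurations to each frame.

```python
# Back-of-envelope model of the tradeoff described above: time-sharing one region
# across N sequential coding stages divides the logic area roughly by N, at the
# cost of N reconfigurations per frame.  All numbers are invented for illustration.

stage_time_ms = [4.0, 6.0, 5.0]      # processing time per stage per frame
reconfig_ms   = 2.5                  # overhead to swap in one stage's configuration
stage_area    = 100                  # logic area units per stage (assumed equal)

n = len(stage_time_ms)
static_area   = n * stage_area                       # all stages resident at once
shared_area   = stage_area                           # one stage resident at a time
static_frame  = sum(stage_time_ms)                   # no reconfiguration needed
shared_frame  = sum(stage_time_ms) + n * reconfig_ms # pay reconfiguration per stage

print(f"area: {static_area} -> {shared_area} (x{static_area / shared_area:.1f} smaller)")
print(f"frame time: {static_frame} ms -> {shared_frame} ms")
```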


Patent
24 Oct 1995
TL;DR: In this paper, a method for programming an FPGA memory includes the steps of downloading a first set of data into an FPGA, and then downloading a second set of data into the memory, the second set being of a type for programming the FPGA to perform one or more logic functions.
Abstract: A programmable logic circuit includes a non-volatile, in-circuit programmable FPGA memory containing configuration data for programming an FPGA to perform one or more desired logic functions. The chips in the circuit may be packaged individually or may be mounted in die form on a multichip module. A method for programming an FPGA memory includes the steps of downloading a first set of data into an FPGA to program the FPGA to function as an FPGA memory programmer, and then downloading a second set of data into the FPGA, the second set of data being of a type for programming the FPGA to perform one or more logic functions. The second set of data, however, is not immediately loaded into the FPGA but instead is programmed into the FPGA memory in a manner controlled by the programming function of the FPGA. A method for programming an FPGA includes as a step the aforementioned method. A multichip module contains an FPGA die and FPGA memory die, either of which may be programmed in accordance with the methods described above.

85 citations


Book ChapterDOI
01 Sep 1995
TL;DR: The Splash 2 Parallel Genetic Algorithm (SPGA) is described, which is a parallel genetic algorithm for optimizing symmetric traveling salesman problems (TSPs) using Splash 2, and the four-processor island-parallel SPGA implementation outperformed all other SPGA configurations tested.
Abstract: With the introduction of Splash, Splash 2, PAM, and other reconfigurable computers, a wide variety of algorithms can now be feasibly constructed in hardware. In this paper, we describe the Splash 2 Parallel Genetic Algorithm (SPGA), which is a parallel genetic algorithm for optimizing symmetric traveling salesman problems (TSPs) using Splash 2. Each processor in SPGA consists of four Field Programmable Gate Arrays (FPGAs) and associated memories and was found to run at 6.8 to 10.6 times the speed of equivalent software on a state-of-the-art workstation. Multiple-processor SPGA systems, which use up to eight processors, find good TSP solutions much more quickly than single-processor and software-based implementations of the genetic algorithm. The four-processor island-parallel SPGA implementation outperformed all other SPGA configurations tested. We conclude by noting that the described parallel genetic algorithm appears to be a good match for reconfigurable computing machines and that Splash 2's various interconnect resources and support for linear systolic and MIMD computing models were important for the implementation of SPGA.

82 citations
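
For readers unfamiliar with the island model, the sketch below is a compact software version of an island-parallel genetic algorithm for a small symmetric TSP, with periodic migration of the best tours between islands. Operators, parameters, and problem size are simplified stand-ins, not the Splash 2 implementation.

```python
import random, math

# Minimal island-model genetic algorithm for a small symmetric TSP, illustrating
# the scheme described above in software only.

random.seed(1)
CITIES = [(random.random(), random.random()) for _ in range(20)]

def tour_len(t):
    return sum(math.dist(CITIES[t[i]], CITIES[t[(i + 1) % len(t)]]) for i in range(len(t)))

def mutate(t):
    a, b = sorted(random.sample(range(len(t)), 2))
    return t[:a] + t[a:b + 1][::-1] + t[b + 1:]        # reverse a random segment

def ox_crossover(p1, p2):
    a, b = sorted(random.sample(range(len(p1)), 2))
    hole = p1[a:b + 1]                                 # keep a slice of parent 1
    rest = [c for c in p2 if c not in hole]            # fill the rest in parent-2 order
    return rest[:a] + hole + rest[a:]

def evolve(pop):
    pop = sorted(pop, key=tour_len)[: len(pop) // 2]   # truncation selection
    while len(pop) < 30:
        child = ox_crossover(*random.sample(pop, 2))
        pop.append(mutate(child) if random.random() < 0.3 else child)
    return pop

islands = [[random.sample(range(20), 20) for _ in range(30)] for _ in range(4)]
for gen in range(200):
    islands = [evolve(p) for p in islands]
    if gen % 25 == 0:                                  # migrate best tours ring-wise
        best = [min(p, key=tour_len) for p in islands]
        for i, p in enumerate(islands):
            p[-1] = best[(i - 1) % len(islands)]

print("best tour length:", min(tour_len(min(p, key=tour_len)) for p in islands))
```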


Proceedings ArticleDOI
19 Apr 1995
TL;DR: It is shown that a generic interface board can be readily adapted to three quite different CCD image sensors, and the structure of the CCD interface configurations and the design issues that arose during their development are presented.
Abstract: We describe the use of a reconfigurable interface board based on FPGAs in a high bandwidth image acquisition system. We show that a generic interface board can be readily adapted to three quite different CCD image sensors. Our guiding philosophy is to implement the entire interface in programmable logic. We depart from this principle only for electrical adaptation, which is performed by relatively simple daughter boards. The CCDs we describe are used for data collection and telescope control at the Swedish Vacuum Solar Telescope located at the Roque de los Muchachos, La Palma, Canary Islands, where various versions of our interface have been in use since May 1993. Reconfigurable solutions are particularly well suited to this application, which is inherently low volume and requires many minor variants in response to changing experimental conditions. The CCDs used deliver image data at 10-30 MB/s. Our interfaces cope comfortably with multiple CCDs operating continuously at this rate on a single host, while also performing low bandwidth servo control tasks. We present the structure of the CCD interface configurations and the design issues that arose during their development. Reconfigurability allows the interface configurations and supporting software to be developed together, leading to efficient, tightly coupled, total system solutions.

Proceedings ArticleDOI
23 Nov 1995
TL;DR: This paper proposes a programming scheme called block-sliced loading, which makes FPGAs C-testable, and presents two types of programming schemes: sequential loading and random access loading.
Abstract: A field-programmable gate array (FPGA) can implement arbitrary logic circuits in the field. In this paper we consider a universal test such that, when applied to an unprogrammed FPGA, it ensures that all the corresponding programmed logic circuits on the FPGA are fault-free. We focus on testing of the look-up tables in FPGAs, and present two types of programming schemes: sequential loading and random access loading. We then show test procedures for FPGAs with these programming schemes and their test complexities. In order to make the test complexity for FPGAs independent of the array size of the FPGAs, we propose a programming scheme called block-sliced loading, which makes FPGAs C-testable.
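
The flavour of such universal testing can be shown with a small simulation: program a look-up table with a pattern, read back every address, then repeat with the complemented pattern so each cell is exercised in both states. This illustrates exhaustive LUT testing in general, not the paper's block-sliced procedure.

```python
from itertools import product

# Sketch of exhaustive look-up-table testing: program the LUT with a pattern,
# apply all 2^k input combinations, and compare against the expected truth table;
# repeating with the complemented pattern checks each cell in both states.

K = 4                                           # LUT inputs

def make_lut(pattern, stuck_at=None):
    """Return a LUT read function; optionally force one cell to a stuck value."""
    cells = list(pattern)
    if stuck_at is not None:
        addr, val = stuck_at
        cells[addr] = val
    return lambda addr: cells[addr]

def test_lut(program):
    """Apply both a pattern and its complement, reading every address."""
    for pattern in ([i % 2 for i in range(2 ** K)],          # 0101... pattern
                    [1 - (i % 2) for i in range(2 ** K)]):    # complement
        lut = program(pattern)
        for bits in product([0, 1], repeat=K):
            addr = int("".join(map(str, bits)), 2)
            if lut(addr) != pattern[addr]:
                return f"fault detected at address {addr}"
    return "no fault detected"

print(test_lut(lambda p: make_lut(p)))                        # good LUT
print(test_lut(lambda p: make_lut(p, stuck_at=(5, 0))))       # cell 5 stuck at 0
```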

Journal ArticleDOI
01 May 1995
TL;DR: This paper presents a brief introduction to custom computing machines and some of the technologies used in the design and development of these machines.
Abstract: We present a brief introduction to custom computing machines.

Book ChapterDOI
02 Oct 1995
TL;DR: This paper is an introduction to FPGAs, presenting differences from more traditional PLDs and giving a survey of two commercial architectures.
Abstract: Field programmable gate arrays (FPGA) are a recently developed family of programmable circuits. Like mask programmable gate arrays (MPGA), FPGAs implement thousands of logic gates. But, unlike MPGAs, a user can program an FPGA design as with traditional programmable logic devices (PLDs): in situ and in a few seconds. These features, added to reprogrammability, have made FPGAs the dream tool for evolvable hardware. This paper is an introduction to FPGAs, presenting differences from more traditional PLDs and giving a survey of two commercial architectures.

Journal ArticleDOI
01 May 1995
TL;DR: The data-parallel bit C (dbC) system is used to program a reconfigurable logic array as a custom SIMD coprocessor; the dbC language has bit-oriented extensions to C that give the programmer control over the size of parallel data objects, so that the data size requirements of the algorithm can be accurately reflected in the hardware.
Abstract: The existence of reconfigurable logic arrays has made it possible to create on the same physical hardware platform many different virtual circuits: A configuration bit stream loaded into the logic array determines the virtual circuit that is to be emulated. The virtual logic can be used to create new instructions that supplement the instruction set of a conventional processor (as demonstrated, for example, in (Athanas and Silverman, 1993) and (Agarwal et al., 1994)) or alternatively can operate as a parallel coprocessor, as shown in the Splash systems (Gokhale et al., 1991) and (Gokhale and Minnich, 1993). The ease of hardware reconfiguration has not been accompanied by a corresponding capability in the software environment. In most cases, schematic capture tools, textual hardware description languages, or low-level vendor-specific tools must be used to create new configurations. Research into high-level synthesis for reconfigurable arrays has begun, however, as evidenced by the PRISM-2 tools of (Agarwal et al., 1994) and the dbC translation system of (Gokhale and Minnich, 1993). The ability to write algorithms for these arrays in a high-level procedural language is key to their effective use. In this paper we describe the data-parallel bit C system that is used to program a reconfigurable logic array as a custom SIMD coprocessor. The dbC language has bit-oriented extensions to C, which give the programmer control over the size of parallel data objects so that the data size requirements of the algorithm can be accurately reflected in the hardware. The data parallel extensions provide the ability to define a SIMD processor array; to operate on data in the processors' memories; to perform global reductions over the processor array; and to communicate data between nearest-neighbor processing elements. We describe the major components of our translation system and show how it is mapped onto the Splash 2 system.
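
A small software emulation of the data-parallel model described above, with invented class and method names (this is not dbC syntax): a SIMD array of processing elements with a user-chosen bit width, elementwise operations, a global reduction, and nearest-neighbor communication.

```python
# Software emulation of the data-parallel model described above: a SIMD array of
# processing elements with user-chosen bit widths, elementwise operations, a
# global reduction, and nearest-neighbor communication.  Names are illustrative only.

class SimdArray:
    def __init__(self, values, bits):
        self.bits = bits                       # declared width of each data object
        self.mask = (1 << bits) - 1
        self.pe = [v & self.mask for v in values]

    def add(self, other):                      # elementwise, truncated to `bits`
        return SimdArray([a + b for a, b in zip(self.pe, other.pe)], self.bits)

    def reduce_sum(self):                      # global reduction over the PE array
        return sum(self.pe)

    def shift_east(self, fill=0):              # nearest-neighbor communication
        return SimdArray([fill] + self.pe[:-1], self.bits)

a = SimdArray(range(8), bits=4)                # 4-bit data objects on 8 PEs
b = a.shift_east()
print(a.add(b).pe, "sum =", a.reduce_sum())
```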

Book ChapterDOI
01 Sep 1995
TL;DR: A new Field Programmable Gate Array (FPGA) architecture is described that includes a number of novel features not found in currently available FPGAs.
Abstract: A new Field Programmable Gate Array (FPGA) architecture is described. This architecture includes a number of novel features not found in currently available FPGAs. It is believed to offer a significantly improved logic density in some common applications.

Proceedings ArticleDOI
18 Sep 1995
TL;DR: The design and synthesis of a high-performance coprocessor for point pattern matching with application to fingerprint matching using Splash 2, an attached processor for SUN SPARCstation hosts, is described.
Abstract: We describe the design and synthesis of a high-performance coprocessor for point pattern matching, with application to fingerprint matching, using Splash 2, an attached processor for SUN SPARCstation hosts. Each of the field programmable gate array (FPGA)-based processing elements (PEs) is programmed using VHDL behavioral modeling. Using the simulation tools, the program logic is verified. The final control bit stream for the PEs is generated using the synthesis tools. The point feature matching coprocessor can run at a peak speed of 17.1 MHz per feature vector of a fingerprint. With 65 features per fingerprint, the matching speed has been projected at the rate of 2.6×10^5 fingerprints/sec. The synthesized coprocessor was tested on a 10000 fingerprint database.
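
A minimal scoring step for point pattern matching, as a software illustration only: count the query minutiae that fall within a distance tolerance of some template minutia. Real matchers, including the coprocessor above, also search over translation and rotation; the tolerance and points here are invented.

```python
import math

# Minimal point pattern matching score: count query minutiae that land within a
# distance tolerance of some template minutia.  Simplified scoring step only.

def match_score(template, query, tol=5.0):
    matched = 0
    for qx, qy in query:
        if any(math.hypot(qx - tx, qy - ty) <= tol for tx, ty in template):
            matched += 1
    return matched / len(query)               # fraction of matched features

template = [(10, 12), (40, 44), (70, 23), (55, 80)]
query    = [(11, 13), (39, 46), (90, 90)]
print(f"match score: {match_score(template, query):.2f}")
```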

Book ChapterDOI
Adrian Lawrence, Andrew Kay, Wayne Luk, Toshio Nomura, Ian Page
01 Sep 1995
TL;DR: Harp1 is a circuit board designed to exploit the rigorous compilation of parallel algorithms directly into hardware that includes a transputer closely-coupled to a Field-Programmable Gate Array (FPGA).
Abstract: Harp1 is a circuit board designed to exploit the rigorous compilation of parallel algorithms directly into hardware. It includes a transputer closely-coupled to a Field-Programmable Gate Array (FPGA). The whole system can be regarded as an instance of a process in the theory of Communicating Sequential Processes (CSP). The major elements themselves can also be viewed in the same way: both the transputer and the FPGA can implement many parallel communicating sub-processes. The Harp1 design includes memory banks, a programmable frequency synthesizer and several communication ports. The latter support the use of parallel arrays of Harp1 boards, as well as interfacing to external hardware. Harp1 is the target of mathematical tools based upon the Ruby and occam languages, which enable unusual and novel applications to be produced and demonstrated correctly and rapidly; the aim is to produce high quality designs at low costs and with reduced development time.

Book ChapterDOI
01 Sep 1995
TL;DR: A VHDL design methodology adapted to FPGAs for achieving optimal synthesis results and implementation of storage elements, finite state machines, and the exploitation of features such as fast-carry logic and built-in RAM are discussed.
Abstract: As synthesis becomes popular for generating FPGA designs, the design style has to be adapted to FPGAs for achieving optimal synthesis results. In this paper, we discuss a VHDL design methodology adapted to FPGA architectures. Implementation of storage elements, finite state machines, and the exploitation of features such as fast-carry logic and built-in RAM are discussed.

Proceedings ArticleDOI
19 Apr 1995
TL;DR: This work presents a dynamic architecture for FPGA based computing systems with field programmable gate arrays and dynamic field-programmable interconnect devices that not only overcomes FPGA pin limitations, but also greatly increases the routability of interconnect networks, resulting in higher overall performance of FPGA based systems.
Abstract: Field programmable gate arrays (FPGAs) have formed the basis for high performance and affordable computing systems. FPGA based logic simulators can emulate complex logic designs at clock speeds several orders of magnitude faster than even accelerated software simulators, while FPGA based prototyping systems provide great flexibility in rapid prototyping and system verification. However, besides FPGA pin limitations, existing FPGA based systems also face the problem of improving the routability of interconnect networks in the architecture design. We present a dynamic architecture for FPGA based computing systems with field programmable gate arrays and dynamic field programmable interconnect devices. Our architecture has advantages in FPGA gate utilization as well as in routability of interconnect networks. The central principle of this new architecture is based on the concept of efficiently exploiting the potential communication bandwidth of interconnect resources. By dynamically reconfiguring the interconnect networks, FPGA pins and interconnect resources are efficiently reused. In this way, this new architecture not only overcomes FPGA pin limitations, but also greatly increases the routability of interconnect networks, resulting in higher overall performance of FPGA based systems.

Proceedings ArticleDOI
19 Sep 1995
TL;DR: In this article, the authors define reconfigurable computing systems as those machines that use the reconfigurability aspects of field programmable gate arrays (FPGAs) to implement an algorithm.
Abstract: We define reconfigurable computing systems as those machines that use the reconfigurable aspects of field programmable gate arrays (FPGAs) to implement an algorithm. Researchers throughout the world have shown that computationally intensive software algorithms can be transposed directly into hardware designs for extreme performance gain. Hardware objects are algorithms implemented as dynamically downloadable hardware designs. Hardware objects execute on reconfigurable computing systems based on SRAM-style FPGAs. A hardware object can be created via schematic capture, the VHSIC hardware description language, or the Verilog hardware description language. To use a hardware design in a software program, it must be converted into a hardware object. The hardware object can be used over and over, or in combination with other hardware objects. This hardware object technology method of programming reconfigurable computers is the subject of this paper.

Proceedings ArticleDOI
15 Feb 1995
TL;DR: In this article, the authors propose the development of families of FPGAs, where each FPGA family is targeted at a single maximum logic capacity, and consists of several siblings, which can be used to implement any given application circuit in the sibling with the most appropriate architecture.
Abstract: In order to narrow the speed and density gap between FPGAs and MPGAs we propose the development of “families” of FPGAs. Each FPGA family is targeted at a single maximum logic capacity, and consists of several “siblings”, or FPGAs of different yet complementary architectures. Any given application circuit is implemented in the sibling with the most appropriate architecture. With properly chosen siblings, one can develop a family of FPGAs which will have better speed and density than any single FPGA. We apply this concept to create two different FPGA families, one composed of architectures with different types of hard-wired logic blocks and the other created from architectures with different types of heterogeneous logic blocks. We found that a family composed of eight chips with different hard-wired logic block architectures simultaneously improves density by 12 to 14% and speed by 18 to 20% over the best single hard-wired FPGA.
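
The family idea can be shown with a toy selection step, using invented sibling names and area/delay estimates: implement each circuit on the sibling that suits it best and compare against forcing every circuit onto a single architecture.

```python
# Sketch of the "family of siblings" idea above: each circuit is implemented on the
# sibling architecture that suits it best, and the result is compared against using
# any single architecture for everything.  The area/delay numbers are invented.

# estimates[circuit][sibling] = (area, delay)
estimates = {
    "fir_filter": {"hardwired_A": (120, 40), "hardwired_B": (150, 35), "hetero_C": (140, 38)},
    "controller": {"hardwired_A": (90, 25),  "hardwired_B": (70, 30),  "hetero_C": (80, 22)},
    "crc_unit":   {"hardwired_A": (60, 18),  "hardwired_B": (55, 20),  "hetero_C": (65, 16)},
}
siblings = ["hardwired_A", "hardwired_B", "hetero_C"]

def total_area(assignment):
    return sum(estimates[c][assignment[c]][0] for c in estimates)

# Family: pick the best sibling per circuit (here: minimum area).
family_choice = {c: min(siblings, key=lambda s: estimates[c][s][0]) for c in estimates}

# Single architecture: force every circuit onto the same sibling.
best_single = min(siblings, key=lambda s: total_area({c: s for c in estimates}))

print("per-circuit choices:", family_choice)
print("family area:", total_area(family_choice),
      "vs best single architecture area:", total_area({c: best_single for c in estimates}))
```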

Proceedings ArticleDOI
19 Apr 1995
TL;DR: A new architecture for reconfigurable real time signal transport systems that uses FPGAs and describes an experimental system design that can be used not only for the initial installation stages of actual communication systems, but also for rapid prototyping tools or emulators.
Abstract: The paper discusses a new architecture for reconfigurable real time signal transport systems that uses FPGAs and describes an experimental system design. The basic architecture of the reconfigurable transport system is proposed based on the requirements for real time signal transport in a typical telecommunication network. The proposed system consists of reconfigurable modules using the custom designed FPGA called PROTEUS, a program control module for hardware/software coprocessing, and line interface modules including an automatic connection control mechanism. The system can be used not only for the initial installation stages of actual communication systems, but also for rapid prototyping tools or emulators.

Book ChapterDOI
01 Sep 1995
TL;DR: The scope of this paper is limited only to the demonstration of the flexibility of the SRAM-based FPGA architecture to tolerate faults which have been detected and located by means not described herein.
Abstract: A novel application of SRAM-based FPGA technology is the development of fault tolerant systems in which reconfigurability is exploited in order to implement inherent redundancy. The approach is to use SRAM-based FPGAs in a mode where fault tolerance is achieved by detection of a fault and its location, and recovery from the fault via device reconfiguration. The scope of this paper is limited to the demonstration of the flexibility of the SRAM-based FPGA architecture to tolerate faults which have been detected and located by means not described herein. Computer simulations of random faults and recovery from the faults have been performed. Results are described validating this technique and the success rate in terms of both routability and performance.

Journal ArticleDOI
01 Apr 1995
TL;DR: A novel scheme for performing fixed-point arithmetic efficiently on fine-grain, massively parallel, programmable architectures including both custom and FPGA-based systems is presented, able to match the performance of the bit-parallel methods, while retaining low communication complexity.
Abstract: In this paper, we present a novel scheme for performing fixed-point arithmetic efficiently on fine-grain, massively parallel, programmable architectures including both custom and FPGA-based systems. We achieve an O(n) speedup, where n is the operand precision, over the bit-serial methods of existing fine-grain systems such as the DAP, the MPP and the CM2, within the constraints of regular, near neighbor communication and only a small amount of on-chip memory. This is possible by means of digit pipelined algorithms which avoid broadcast and which operate in a fully systolic manner by pipelining at the digit level. A base 4, signed-digit, fully redundant number system and on-line techniques are used to limit carry propagation and minimize communication costs. Although our algorithms are digit-serial, we are able to match the performance of the bit-parallel methods, while retaining low communication complexity. Reconfigurable hardware systems built using field programmable gate arrays (FPGAs) can share in the speed benefits of these algorithms. By using the organization of logic blocks suggested in this paper, problems of placement and routing that exist in such systems can be avoided. Since the algorithms are amenable to pipelining, very high throughput can be obtained.
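
The key mechanism named in the abstract, a base 4, fully redundant signed-digit number system that limits carry propagation, can be sketched as a carry-free digit-serial addition. The code below is the textbook formulation of radix-4 signed-digit addition processed from the least significant digit, not the paper's pipelined hardware.

```python
# Sketch of carry-limited addition in a base-4, fully redundant signed-digit system
# (digits in -3..3), processed serially from the least significant digit.

def sd4_add(x, y):
    """Add two equal-length signed-digit radix-4 numbers (LSD first)."""
    t_prev, z = 0, []
    for xd, yd in zip(x, y):
        p = xd + yd                      # position sum, in -6..6
        if p >= 3:    t, w = 1, p - 4    # transfer digit and interim sum
        elif p <= -3: t, w = -1, p + 4
        else:         t, w = 0, p
        z.append(w + t_prev)             # |w| <= 2 and |t_prev| <= 1, so result in -3..3
        t_prev = t
    z.append(t_prev)                     # final transfer becomes most significant digit
    return z

def value(digits):                       # interpret LSD-first radix-4 signed digits
    return sum(d * 4 ** i for i, d in enumerate(digits))

x = [3, -2, 1, 0]                        # 3 - 2*4 + 1*16 = 11
y = [2, 3, -1, 1]                        # 2 + 3*4 - 16 + 64 = 62
s = sd4_add(x, y)
print(s, value(x), "+", value(y), "=", value(s))    # 11 + 62 = 73
```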

Proceedings ArticleDOI
19 Apr 1995
TL;DR: The implementation of an FPGA based coprocessor and its programming methodology are shown, and the effects of different sequencing models, and of regular and irregular circuits, on the hardware and on the programming methodology are discussed.
Abstract: The implementation of an FPGA based coprocessor and its programming methodology are shown. The effects of different sequencing models, and of regular and irregular circuits, on the hardware and on the programming methodology are discussed. Two examples are described: a sorting network and the kernel of a speech recognition algorithm. The results are still preliminary, but they suggest some architectural improvements for general FPGA based computing machines.
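
One of the two examples is a sorting network; the sketch below shows odd-even transposition sort, a classic fully regular network of compare-exchange stages that maps naturally onto array hardware. It is offered as an illustration of the kind of network meant, not the paper's circuit.

```python
# Odd-even transposition sorting network: a fully regular network of
# compare-exchange stages that maps naturally onto array hardware.

def odd_even_transposition_sort(values):
    a = list(values)
    n = len(a)
    for stage in range(n):                       # n stages of compare-exchange pairs
        start = stage % 2                        # alternate odd and even pairings
        for i in range(start, n - 1, 2):
            if a[i] > a[i + 1]:                  # compare-exchange one element pair
                a[i], a[i + 1] = a[i + 1], a[i]
    return a

print(odd_even_transposition_sort([7, 3, 9, 1, 8, 2, 5, 4]))
```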

Proceedings ArticleDOI
25 Jan 1995
TL;DR: The design of a computer architecture for the constraint-check function, a function of the QSIM kernel, is presented, and two design strategies are considered to improve performance.
Abstract: The design of a specialized computer architecture for qualitative simulation is presented. Our interest focuses on the hardware design of an application-specific computer architecture which is composed of programmable processors (digital signal processors TMS320C40) and application-specific integrated circuits (FPGAs). Two design strategies are considered to improve the performance. Primitive functions are hardware-implemented using FPGAs (software/hardware migration). More complex functions are mapped onto a multiprocessor system formed by TMS320C40 processors. This computer architecture is designed for the well known algorithm for qualitative simulation, QSIM. In this paper we present the design of a computer architecture for the constraint-check function, a function of the QSIM kernel.

Proceedings ArticleDOI
01 Dec 1995
TL;DR: This paper presents an efficient approach to the problem of multiway partitioning of large FPGA netlists onto heterogeneous FPGA boards; the algorithm is able to consider the board architecture.
Abstract: FPGAs are well accepted as an alternative to ASICs and for rapid prototyping purposes. Netlists of designs which are too large to be implemented on a single FPGA, have to be mapped onto a set of FPGAs, which could be organized on an FPGA board containing various FPGAs connected by interconnection networks. This paper presents an efficient approach to the problem of multiway partitioning of large FPGA netlists onto heterogeneous FPGA boards. To optimize the resulting partitioning with respect to the target architecture, our algorithm is able to consider the board architecture.
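
As a simplified illustration of capacity-aware multiway partitioning onto a heterogeneous board (not the paper's algorithm), the greedy sketch below assigns each block to the FPGA that still has room and adds the fewest cut nets. Device names, capacities, and the netlist are invented.

```python
# Greedy sketch of multiway partitioning onto a heterogeneous FPGA board: each
# block goes to the device that has room and adds the fewest cut nets.

fpga_capacity = {"XC4010": 400, "XC4005": 250, "XC3090": 320}   # usable gates (invented)
block_size = {"alu": 180, "regfile": 120, "ctrl": 90, "io": 60, "dma": 200}
nets = [("alu", "regfile"), ("alu", "ctrl"), ("ctrl", "io"), ("dma", "io")]

assignment, used = {}, {f: 0 for f in fpga_capacity}

def added_cuts(block, fpga):
    """Nets that would newly cross chip boundaries if `block` went onto `fpga`."""
    return sum(1 for a, b in nets
               if block in (a, b)
               and assignment.get(b if a == block else a, fpga) != fpga)

for block in sorted(block_size, key=block_size.get, reverse=True):   # big blocks first
    candidates = [f for f in fpga_capacity if used[f] + block_size[block] <= fpga_capacity[f]]
    best = min(candidates, key=lambda f: (added_cuts(block, f), used[f]))
    assignment[block] = best
    used[best] += block_size[block]

cut = sum(assignment[a] != assignment[b] for a, b in nets)
print(assignment, "| cut nets:", cut)
```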

Proceedings ArticleDOI
19 Sep 1995
TL;DR: The WILDFIRE hardware and the accompanying software environment for application development and runtime operation are presented, and suitable applications for WILDFIRE and future capabilities are discussed.
Abstract: WILDFIRE is a commercial reconfigurable computer architecture based on field programmable gate array (FPGA) technology. Programmers achieve high processing performance by rapidly modifying the internal hardware architecture through software to efficiently accommodate the specific processing needs of an application. The WILDFIRE hardware and the accompanying software environment for applications development and runtime operation are presented. Suitable applications for WILDFIRE and future capabilities are also discussed. Keywords: computer architecture, FPGA, reconfigurable computing, VHDL. 1.0 INTRODUCTION: WILDFIRE represents a new vision in commercially available computing technology implemented using field programmable gate arrays (FPGAs) that allows programmers to configure the architecture of the processing elements to exhibit the computational features required by an application. WILDFIRE is based on the Splash 2 technology transferred from the National Security Agency and the Institute for Defense Analyses, Supercomputing Research Center (SRC). This technology has effectively solved parallel processing and rapid prototyping problems on a variety of applications relating to image processing, text searching, and sequence analysis. The discussion begins in Section 2 by describing the motivation for reconfigurable computing. It continues in Section 3 with a history of the Splash 2 project that led to the development of WILDFIRE. Section 4 discusses the system architecture of WILDFIRE. This includes a description of the layout and the interaction of the hardware and a description of both the applications development and runtime