
Showing papers on "Reconfigurable computing" published in 2001


Proceedings ArticleDOI
13 Mar 2001
TL;DR: The paper surveys a decade of R&D on coarse grain reconfigurable hardware and related CAD, points out why this emerging discipline is heading toward a dichotomy of computing science, and advocates the introduction of a new soft machine paradigm to replace CAD by compilation.
Abstract: The paper surveys a decade of R&D on coarse grain reconfigurable hardware and related CAD, points out why this emerging discipline is heading toward a dichotomy of computing science, and advocates the introduction of a new soft machine paradigm to replace CAD by compilation.

661 citations


Journal ArticleDOI
01 May 2001
TL;DR: A survey of academic research and commercial development in reconfigurable computing for DSP systems over the past fifteen years is presented in this article, with a focus on the application domain of digital signal processing.
Abstract: Steady advances in VLSI technology and design tools have extensively expanded the application domain of digital signal processing over the past decade. While application-specific integrated circuits (ASICs) and programmable digital signal processors (PDSPs) remain the implementation mechanisms of choice for many DSP applications, increasingly new system implementations based on reconfigurable computing are being considered. These flexible platforms, which offer the functional efficiency of hardware and the programmability of software, are quickly maturing as the logic capacity of programmable devices follows Moore's Law and advanced automated design techniques become available. As initial reconfigurable technologies have emerged, new academic and commercial efforts have been initiated to support power optimization, cost reduction, and enhanced run-time performance. This paper presents a survey of academic research and commercial development in reconfigurable computing for DSP systems over the past fifteen years. This work is placed in the context of other available DSP implementation media including ASICs and PDSPs to fully document the range of design choices available to system engineers. It is shown that while contemporary reconfigurable computing can be applied to a variety of DSP applications including video, audio, speech, and control, much work remains to realize its full potential. While individual implementations of PDSP, ASIC, and reconfigurable resources each offer distinct advantages, it is likely that integrated combinations of these technologies will provide more complete solutions.

390 citations


Patent
19 Oct 2001
TL;DR: In this article, the authors present a real-time tool to select a set of software modules and hardware configuration files from a series of libraries, which are then downloaded to the reconfigurable hardware.
Abstract: A reconfigurable test system including a host computer coupled to a reconfigurable test instrument. The reconfigurable test instrument includes reconfigurable hardware—i.e. a reconfigurable hardware module with one or more programmable elements such as Field Programmable Gate Arrays for realizing an arbitrary hardware architecture and a reconfigurable front end with programmable transceivers for interfacing with any desired physical medium—and optionally, an embedded processor. A user specifies system features with a software configuration utility which directs a component selector to select a set of software modules and hardware configuration files from a series of libraries. The modules are embedded in a host software driver or downloaded for execution on the embedded CPU. The configuration files are downloaded to the reconfigurable hardware. The entire selection process is performed in real-time and can be changed whenever the user deems necessary. Alternatively, the user may create a graphical program in a graphical programming environment and compile the program into various software modules and configuration files for host execution, embedded processor execution, or programming the reconfigurable hardware.

212 citations


Journal ArticleDOI
TL;DR: This contribution proposes arithmetic architectures which are optimized for modern field programmable gate arrays (FPGAs) that perform modular exponentiation with very long integers, at the heart of many practical public-key algorithms such as RSA and discrete logarithm schemes.
Abstract: It is widely recognized that security issues will play a crucial role in the majority of future computer and communication systems. Central tools for achieving system security are cryptographic algorithms. This contribution proposes arithmetic architectures which are optimized for modern field programmable gate arrays (FPGAs). The proposed architectures perform modular exponentiation with very long integers. This operation is at the heart of many practical public-key algorithms such as RSA and discrete logarithm schemes. We combine a high-radix Montgomery modular multiplication algorithm with a new systolic array design. The designs are flexible, allowing any choice of operand and modulus. The new architecture also allows the use of high radices. Unlike previous approaches, we systematically implement and compare several variants of our new architecture for different bit lengths. We provide absolute area and timing measures for each architecture. The results allow conclusions about the feasibility and time-space trade-offs of our architecture for implementation on commercially available FPGAs. We found that 1,024-bit RSA decryption can be done in 3.1 ms with our fastest architecture.

196 citations
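
The high-radix Montgomery multiplication named in the abstract above can be illustrated in software. The following is a minimal sketch, assuming a word-serial radix-2^k formulation and plain left-to-right square-and-multiply; it is not the paper's systolic array design, and the names `mont_mul`, `mont_pow`, and `radix_bits` are hypothetical choices made for this illustration.

```python
def mont_mul(a, b, n, k, num_words, n0_prime):
    """Radix-2^k Montgomery product: returns a * b * R^-1 mod n,
    where R = 2^(k * num_words). Requires 0 <= a, b < n and n odd."""
    mask = (1 << k) - 1
    t = 0
    for i in range(num_words):
        a_i = (a >> (i * k)) & mask            # i-th radix-2^k digit of a
        t += a_i * b
        m = ((t & mask) * n0_prime) & mask     # makes t + m*n divisible by 2^k
        t = (t + m * n) >> k                   # exact division by the radix
    return t - n if t >= n else t              # t < 2n, so one subtraction suffices


def mont_pow(base, exponent, n, radix_bits=16):
    """Modular exponentiation base^exponent mod n built on mont_mul.
    Assumes n is odd (as RSA moduli are). Needs Python 3.8+ for pow(x, -1, m)."""
    if n % 2 == 0:
        raise ValueError("Montgomery arithmetic needs an odd modulus")
    k = radix_bits
    num_words = -(-n.bit_length() // k)        # ceiling division
    r = 1 << (k * num_words)
    n0_prime = (-pow(n, -1, 1 << k)) % (1 << k)
    r2 = (r * r) % n
    x = mont_mul(base % n, r2, n, k, num_words, n0_prime)  # base in Montgomery form
    acc = r % n                                             # 1 in Montgomery form
    for bit in bin(exponent)[2:]:
        acc = mont_mul(acc, acc, n, k, num_words, n0_prime)
        if bit == "1":
            acc = mont_mul(acc, x, n, k, num_words, n0_prime)
    return mont_mul(acc, 1, n, k, num_words, n0_prime)      # leave Montgomery form
```

Raising `radix_bits` reduces the number of loop iterations per multiplication at the cost of wider digit products, which loosely mirrors the time-space trade-off the abstract refers to.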


Book ChapterDOI
14 May 2001
TL;DR: This paper presents an evaluation of the Rijndael cipher from the viewpoint of its implementation in Field Programmable Devices (FPDs); the results obtained are significantly faster than those of other implementations known to date.
Abstract: This paper presents an evaluation of the Rijndael cipher, the Advanced Encryption Standard winner, from the viewpoint of its implementation in Field Programmable Devices (FPDs). Starting with an analysis of the algorithm's general characteristics, a general cipher structure is described. Two different methods of mapping the Rijndael algorithm to FPDs are analyzed, and the suitability of available FPD families is evaluated. Finally, results of the proposed mapping implemented in Altera FLEX, ACEX and APEX FPDs are presented and compared with the fastest known Xilinx FPGA implementation. The results obtained are significantly faster than those of other implementations known to date.

159 citations


Proceedings ArticleDOI
01 Feb 2001
TL;DR: In mapping the k-means algorithm to FPGA hardware, this work examined algorithm level transforms that dramatically increased the achievable parallelism and also examined the effects of using fixed precision and truncated bit widths in the algorithm.
Abstract: In mapping the k-means algorithm to FPGA hardware, we examined algorithm level transforms that dramatically increased the achievable parallelism. We apply the k-means algorithm to multi-spectral and hyper-spectral images, which have tens to hundreds of channels per pixel of data. K-means is an iterative algorithm that assigns to each pixel a label indicating which of K clusters the pixel belongs to. K-means is a common solution to the segmentation of multi-dimensional data. The standard software implementation of k-means uses floating-point arithmetic and Euclidean distances. Floating point arithmetic and the multiplication-heavy Euclidean distance calculation are fine on a general purpose processor, but they have large area and speed penalties when implemented on an FPGA. In order to get the best performance of k-means on an FPGA, the algorithm needs to be transformed to eliminate these operations. We examined the effects of using two other distance measures, Manhattan and Max, that do not require multipliers. We also examined the effects of using fixed precision and truncated bit widths in the algorithm. It is important to explore algorithmic level transforms and tradeoffs when mapping an algorithm to reconfigurable hardware. A direct translation of the standard software implementation of k-means would result in a very inefficient use of FPGA hardware resources. Analysis of the algorithm and data is necessary for a more efficient implementation. Our resulting implementation exhibits approximately a 200 times speed up over a software implementation.

152 citations
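
The multiplier-free distance measures mentioned in the abstract above can be sketched in a few lines. This is an illustrative sketch only, assuming integer pixel channels and modeling bit-width truncation as a right shift; the function names and the sample data are hypothetical and do not come from the paper.

```python
def manhattan(pixel, center):
    """Sum of absolute channel differences: adders only, no multipliers."""
    return sum(abs(p - c) for p, c in zip(pixel, center))


def max_distance(pixel, center):
    """Largest absolute channel difference: comparators only."""
    return max(abs(p - c) for p, c in zip(pixel, center))


def truncate(vector, drop_bits):
    """Model a reduced bit width by discarding low-order bits (assumed scheme)."""
    return tuple(v >> drop_bits for v in vector)


def assign_labels(pixels, centers, distance):
    """One k-means assignment step: label each pixel with its nearest center."""
    return [
        min(range(len(centers)), key=lambda k: distance(pixel, centers[k]))
        for pixel in pixels
    ]


# Hypothetical 3-channel pixels and two cluster centers.
pixels = [(10, 200, 30), (12, 190, 35), (200, 20, 220)]
centers = [(0, 200, 30), (210, 10, 200)]

labels_manhattan = assign_labels(pixels, centers, manhattan)
labels_max = assign_labels(pixels, centers, max_distance)
labels_truncated = assign_labels(
    [truncate(p, 4) for p in pixels],
    [truncate(c, 4) for c in centers],
    manhattan,
)
```

On this small, well-separated sample all three variants produce the same labels; on real multi-spectral data the choice of measure and bit width can change the clustering, which is the kind of trade-off the abstract describes examining.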


Journal ArticleDOI
TL;DR: By enabling multiple applications to be dynamically loaded into a single hardware device, the DHP architecture provides a scalable mechanism for implementing high-performance programmable routers.

135 citations


Proceedings ArticleDOI
30 Jan 2001
TL;DR: The paper gives a brief survey over a decade of R&D on coarse grain reconfigurable hardware and related compilation techniques and points out its significance to the emerging discipline of reconfigurable computing.
Abstract: The paper gives a brief survey over a decade of R&D on coarse grain reconfigurable hardware and related compilation techniques and points out its significance to the emerging discipline of reconfigurable computing.

121 citations


Proceedings ArticleDOI
30 Jan 2001
TL;DR: The paper gives a brief survey over a decade of R&D on coarse grain reconfigurable hardware and related compilation techniques and points out its significance to the emerging discipline of reconfigurable computing.
Abstract: The paper gives a brief survey over a decade of R&D on coarse grain reconfigurable hardware and related compilation techniques and points out its significance to the emerging discipline of reconfigurable computing.

107 citations


Journal ArticleDOI
TL;DR: Simulations show that the optimized BBNN can solve engineering problems such as pattern classification and mobile robot control.
Abstract: This paper presents a novel block-based neural network (BBNN) model and the optimization of its structure and weights based on a genetic algorithm. The architecture of the BBNN consists of a 2D array of fundamental blocks with four variable input/output nodes and connection weights. Each block can have one of four different internal configurations depending on the structure settings. The BBNN model includes some restrictions such as 2D array and integer weights in order to allow easier implementation with reconfigurable hardware such as field programmable gate arrays (FPGAs). The structure and weights of the BBNN are encoded with bit strings which correspond to the configuration bits of FPGAs. The configuration bits are optimized globally using a genetic algorithm with 2D encoding and modified genetic operators. Simulations show that the optimized BBNN can solve engineering problems such as pattern classification and mobile robot control.

96 citations


Book
30 Jun 2001
TL;DR: The book addresses the energy consumption of Field-Programmable Gate Arrays (FPGAs) and how the programmability of the FPGA can be used to customize implementations of functions on an application basis.
Abstract: The book addresses the energy consumption of Field-Programmable Gate Arrays (FPGAs). FPGAs are becoming popular as embedded components in computing platforms. The programmability of the FPGA can be used to customize implementations of functions on an application basis. This leads to performance gains, and enables reuse of expensive silicon.

Journal ArticleDOI
TL;DR: This article proposes a strategy to automate the design process which considers all possible optimizations that can be carried out at compilation time, regarding context and data transfers, as well as the context management and scheduling optimizations.
Abstract: Dynamically reconfigurable architectures are emerging as a viable design alternative to implement a wide range of computationally intensive applications. At the same time, an urgent necessity has arisen for support tool development to automate the design process and achieve optimal exploitation of the architectural features of the system. Task scheduling and context (configuration) management become very critical issues in achieving the high performance that digital signal processing (DSP) and multimedia applications demand. This article proposes a strategy to automate the design process which considers all possible optimizations that can be carried out at compilation time, regarding context and data transfers. This strategy is general in nature and could be applied to different reconfigurable systems. We also discuss the key aspects of the scheduling problem in a reconfigurable architecture such as MorphoSys. In particular, we focus on a task scheduling methodology for DSP and multimedia applications, as well as the context management and scheduling optimizations.

Journal ArticleDOI
TL;DR: Accelerating a genetic algorithm (GA) by implementing it in a reconfigurable field programmable gate array (FPGA) is described; the design achieves a net child chromosome generation rate of one per clock cycle by pipelining the parent selection, crossover, mutation, and fitness evaluation functions.
Abstract: Accelerating a genetic algorithm (GA) by implementing it in a reconfigurable field programmable gate array (FPGA) is described. The implemented GA features: random parent selection, which conserves selection circuitry; a steady-state memory model, which conserves chip area; survival of fitter child chromosomes over their less-fit parent chromosomes, which promotes evolution. A net child chromosome generation rate of one per clock cycle is obtained by pipelining the parent selection, crossover, mutation, and fitness evaluation functions. Complex fitness functions can be further pipelined to maintain a high-speed clock cycle. Fitness functions with a pipeline initiation interval of greater than one can be plurally implemented to maintain a net evaluated-chromosome throughput of one per clock cycle. Two prototypes are described: The first prototype (c. 1996 technology) is a multiple-FPGA chip implementation, running at a 1 MHz clock rate, that solves a 94-row × 520-column set covering problem 2,200× faster than a 100 MHz workstation running the same algorithm in C. The second prototype (Xilinx XCV300) is a single-FPGA chip implementation, running at a 66 MHz clock rate, that solves a 36-residue protein folding problem in a 2-D lattice 320× faster than a 366 MHz Pentium II. The current largest FPGA (Xilinx XCV3200E) has circuitry available for the implementation of 30 fitness function units which would yield an acceleration of 9,600× for the 36-residue protein folding problem.

Proceedings ArticleDOI
29 Apr 2001
TL;DR: Tests on the prototype with benchmark examples show that it is feasible and that fragmentation of the area of the FPGA among many users is manageable.
Abstract: An operating system (OS) for reconfigurable computing uses new versions of algorithms for the allocation of area to tasks, the partitioning of an application to fit selected allocated areas and the placement and routing inside partitions. The algorithms have small deterministic and bounded run times with near linear time complexity, making them suitable to run in between time slices or at the initial loading stage of applications. Tests on the prototype with benchmark examples show that it is feasible and that fragmentation of the area of the FPGA among many users is manageable.

Proceedings ArticleDOI
29 Apr 2001
TL;DR: Results from implementation on a reconfigurable computing platform show that significant logic usage savings and increased clock rates can be obtained by customizing the datapath precision to the algorithm according to the techniques described in this paper.
Abstract: This paper presents a paradigm for the design of multiple wordlength parallel processing systems for DSP applications based on varying the wordlength and scaling of each signal in a DSP block diagram. A technique for estimating the observable effects of truncation and roundoff error is illustrated, and used to form the basis of an optimization algorithm to automate the design of such multiple wordlength systems. Results from implementation on a reconfigurable computing platform show that significant logic usage savings and increased clock rates can be obtained by customizing the datapath precision to the algorithm according to the techniques described in this paper. On selected DSP benchmarks, we obtain up to 45% area reduction and up to 39% speed increase over standard design techniques.

Proceedings ArticleDOI
29 Apr 2001
TL;DR: A reconfigurable computing development environment called Pilchard, employing a field programmable gate array (FPGA) which plugs into a standard personal computer's (PC) 133 MHz synchronous dynamic RAM Dual In-line Memory Modules (DIMMs) slot is presented.
Abstract: A reconfigurable computing development environment called Pilchard, employing a field programmable gate array (FPGA) which plugs into a standard personal computer's (PC) 133 MHz synchronous dynamic RAM Dual In-line Memory Modules (DIMMs) slot is presented. Compared with a traditional PCI interfaced reconfigurable computing board, the DIMM interface offers higher bandwidth, a simpler interface and lower latency. The transfer rate of the Pilchard board is compared with that of a standard PC132 reconfigurable computing board, and an implementation of the data encryption standard (DES) is presented. Together, the board and interface generator provide an easy to use, low cost and high performance platform for reconfigurable computing.

Proceedings ArticleDOI
19 Apr 2001
TL;DR: This paper presents the hardware structure and application of a coarse-grained dynamically reconfigurable hardware architecture dedicated to wireless communication systems, and provides a motivation for choosing the concept of distributed arithmetic in reconfigurable computing.
Abstract: This paper presents the hardware structure and application of a coarse-grained dynamically reconfigurable hardware architecture dedicated to wireless communication systems. The application tailored architecture, called DReAM (Dynamically Reconfigurable Hardware Architecture for Mobile Communication Systems), is a research project at the Darmstadt University of Technology. It covers the complete design process from analyzing the requirements for the dedicated application field, the specification and VHDL implementation of the architecture, up to the physical layout for the final chip. In the following we provide an overview of the major design stages, starting with a motivation for choosing the concept of distributed arithmetic in reconfigurable computing.

Proceedings ArticleDOI
16 Nov 2001
TL;DR: This paper describes a compiler framework to analyze SA-C programs, perform optimizations, and map the application onto the Morphosys architecture, a reconfigurable system-on-chip architecture that supports a data-parallel, SIMD computational model.
Abstract: The rapid growth of silicon densities has made it feasible to deploy reconfigurable hardware as a highly parallel computing platform. However, in most cases, the application needs to be programmed in hardware description or assembly languages, whereas most application programmers are familiar with the algorithmic programming paradigm. SA-C has been proposed as an expression-oriented language designed to implicitly express data parallel operations. Morphosys is a reconfigurable system-on-chip architecture that supports a data-parallel, SIMD computational model. This paper describes a compiler framework to analyze SA-C programs, perform optimizations, and map the application onto the Morphosys architecture. The mapping process involves operation scheduling, resource allocation and binding and register allocation in the context of the Morphosys architecture. The execution times of some compiled image-processing kernels can achieve up to 42x speed-up over an 800 MHz Pentium III machine.

Proceedings ArticleDOI
29 Apr 2001
TL;DR: The Totem custom reconfigurable array generator is the initial step in this direction, generating a custom array for a given computation domain and exploring the design space between an ASIC and an FPGA.
Abstract: Reconfigurable hardware has been shown to provide an efficient compromise between the flexibility of software and the performance of hardware. However, even coarse-grained reconfigurable architectures target the general case, and miss optimization opportunities present if characteristics of the desired application set are known. We can therefore increase efficiency by restricting the structure to support a class or a specific set of algorithms, while still providing flexibility within that set. By generating a custom array for a given computation domain, we explore the design space between an ASIC and an FPGA. However, the manual creation of these customized reprogrammable architectures would be a labor-intensive process, leading to high design costs. Instead, we propose automatic reconfigurable architecture generation specialized to given application sets. The Totem custom reconfigurable array generator is our initial step in this direction.

Proceedings ArticleDOI
29 Apr 2001
TL;DR: Experimental and analytical results show that the two column-based precompiled configuration techniques for tolerating permanent faults in FPGA-based systems achieve significant dependability improvement with small configuration storage overhead.
Abstract: The abundance of configurable logic elements and routing resources in recent Field-Programmable Gate Arrays (FPGAs) provides a cost-effective method for tolerating permanent faults in the system. Once a permanent fault occurs, the FPGA can be reconfigured by replacing the faulty part with previously unused resources in the same hardware. In this paper, we present two column-based precompiled configuration techniques for tolerating permanent faults in FPGA-based systems. By compiling alternative configuration versions in the design phase, these approaches ensure fast reconfiguration, and thus a tremendous increase in system availability. In addition, intentional similarities are created among different configuration versions so that the storage overhead due to precompiled configurations is reduced by orders of magnitude through differential coding and run-length coding. Experimental and analytical results show that our approaches achieve significant dependability improvement with small configuration storage overhead.
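
The storage saving from differential coding followed by run-length coding, as described in the abstract above, can be shown with a toy example. This is a sketch under stated assumptions: the abstract does not specify the coding details, so the difference is modeled here as a bytewise XOR against a base configuration, and the function names and the synthetic 64-byte "bitstream" are hypothetical.

```python
def xor_diff(base, alternative):
    """Differential coding (assumed form): bytewise XOR against the base."""
    if len(base) != len(alternative):
        raise ValueError("configurations must have equal length")
    return bytes(b ^ a for b, a in zip(base, alternative))


def rle_encode(data):
    """Run-length coding: list of (byte_value, run_length) pairs."""
    runs = []
    for value in data:
        if runs and runs[-1][0] == value:
            runs[-1] = (value, runs[-1][1] + 1)
        else:
            runs.append((value, 1))
    return runs


def rle_decode(runs):
    """Inverse of rle_encode."""
    return b"".join(bytes([value]) * length for value, length in runs)


def reconstruct(base, runs):
    """Rebuild the alternative configuration from the base and the coded diff."""
    return xor_diff(base, rle_decode(runs))


# Synthetic example: the alternative differs from the base in a single byte,
# standing in for configurations made intentionally similar at design time.
base_config = bytes(range(64))
alt_config = bytearray(base_config)
alt_config[10] ^= 0xFF
alt_config = bytes(alt_config)

coded = rle_encode(xor_diff(base_config, alt_config))
```

Because similar configurations XOR to long runs of zeros, the coded difference here is three runs instead of 64 stored bytes, which illustrates why intentional similarity among versions reduces storage overhead.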

Proceedings ArticleDOI
30 Oct 2001
TL;DR: An accurate static timing analyzer is used to identify critical paths and built-in self-test (BIST) hardware is inserted using a placement and routing tool to accomplish testing with low test application time.
Abstract: The widespread use of field programmable gate arrays (FPGAs) as components in high-performance systems has increased the significance of path delay faults in FPGAs. We present a technique for FPGA path delay fault detection which integrates test insertion with the FPGA placement and routing stages to accomplish testing with low test application time. An accurate static timing analyzer is used to identify critical paths and built-in self-test (BIST) hardware is inserted using a placement and routing tool. Initial experimental results show that testing is accomplished with low test application time for several benchmark designs.

Proceedings ArticleDOI
01 Jan 2001
TL;DR: A high-level power modeling technique is presented to estimate the power consumption of reconfigurable devices such as complex programmable logic devices (CPLDs) and field programmable gate arrays (FPGAs), using an adaptive regression method to model the FPGA power consumption.
Abstract: We present a high-level power modeling technique to estimate the power consumption of reconfigurable devices such as complex programmable logic devices (CPLDs) and field programmable gate arrays (FPGAs). For simplicity, we refer to these devices as FPGAs. First, we capture the relationship between FPGA power dissipation and I/O signal statistics. We then use an adaptive regression method to model the FPGA power consumption. Such a high-level model can be used in the inner loop of a system-level synthesis tool to estimate the power consumed by different FPGA resources for different potential system-level synthesis solutions. It can also be used to verify the power budgets during embedded system design. With our high-level power model, the FPGA power consumption can be obtained very quickly. Experimental results indicate that the average relative error is only 3.1% compared to low-level FPGA power simulation methods.

01 Jan 2001
TL;DR: A realization of multiple-output logic functions using a RAM and a sequencer is presented, where a multiple-output function is represented by an encoded characteristic function for non-zeros (ECFN) and then by a cascade of look-up tables (LUTs).
Abstract: A realization of multiple-output logic functions using a RAM and a sequencer is presented. First, a multiple-output function is represented by an encoded characteristic function for non-zeros (ECFN). Then, it is represented by a cascade of look-up tables (LUTs). And finally, the cascade is simulated by a RAM and a sequencer. Multiple-output functions for benchmark functions are realized by cascades of LUTs, and the number of LUTs and levels of cascades are shown. A partition method of outputs for parallel evaluation is also presented. A prototype has been developed by using RAM and FPGA.

Proceedings Article
Leong, Cheung, Tung, Kwok, Wong, Lee 
01 Jan 2001

Book ChapterDOI
01 Oct 2001
TL;DR: The results of the first phase of a project aimed at implementing a full suite of IPSec cryptographic transformations in reconfigurable hardware show that this gigabit-rate, double-algorithm, encryption/decryption circuit will fit in one Virtex 1000 FPGA taking approximately 80% of the area.
Abstract: In this paper, we present the results of the first phase of a project aimed at implementing a full suite of IPSec cryptographic transformations in reconfigurable hardware. Full implementations of the new Advanced Encryption Standard, Rijndael, and the older American federal standard, Triple DES, were developed and experimentally tested using the SLAAC-1V FPGA accelerator board, based on Xilinx Virtex 1000 devices. The experimental clock frequencies were equal to 91 MHz for Triple DES, and 52 MHz for Rijndael. This translates to the throughputs of 116 Mbit/s for Triple DES, and 577, 488, and 423 Mbit/s for Rijndael with 128-, 192-, and 256-bit keys respectively. We also demonstrate a capability to enhance our circuit to handle the encryption and decryption throughputs of over 1 Gbit/s regardless of the chosen algorithm. Our estimates show that this gigabit-rate, double-algorithm, encryption/decryption circuit will fit in one Virtex 1000 FPGA taking approximately 80% of the area.

Journal ArticleDOI
TL;DR: A breakthrough is achieved in solving problems to optimality by using the new notion of packing classes, which allows a significant reduction of the search space such that problems of the above type may be solved exactly using a special branch-and-bound technique.
Abstract: Recent generations of Field Programmable Gate Arrays (FPGA) allow the dynamic reconfiguration of cells on the chip during run-time. For a given problem consisting of a set of tasks with computation requirements modeled by rectangles of cells, several optimization problems such as finding the array of minimal size to accomplish the tasks within a given time limit are considered. Existing approaches based on ILP formulations to solve these problems as multi-dimensional packing problems turn out not to be applicable for problem sizes of interest. Here, a breakthrough is achieved in solving these problems to optimality by using the new notion of packing classes. It allows a significant reduction of the search space such that problems of the above type may be solved exactly using a special branch-and-bound technique. We validate the usefulness of our method by providing computational results.

Proceedings ArticleDOI
M. Renovell, P. Faure, Jean-Michel Portal, J. Figueras, Yervant Zorian
30 Oct 2001
TL;DR: It is demonstrated that the implicit-scan concept allows 'over-scan' of sequential circuits resulting in highly testable circuits and is transparent for the user as well as for the FPGA mapping tools.
Abstract: Proposes a new and original FPGA architecture with testability facilities. It is first demonstrated that classical FPGA architectures do not allow one to efficiently implement sequential circuits with a scan chain. It is consequently proposed to modify the architecture of classical FPGAs in order to create an implicit-scan chain into the FPGA itself called implicit scan FPGA (IS-FPGA). Using this new FPGA architecture, any sequential circuit implemented into the FPGA is 'implicitly scanned'. An original and optimal implementation of the proposed architecture is given with minimum area overhead and absolutely no delay impact. Additionally the technique is transparent for the user as well as for the FPGA mapping tools. Finally, it is demonstrated that the implicit-scan concept allows 'over-scan' of sequential circuits resulting in highly testable circuits.

Journal ArticleDOI
TL;DR: A design environment is discussed that exploits three FPGA-specific advantages to create a unified simulation/execution debug environment implemented in the JHDL design system.
Abstract: Field programmable gate array (FPGA)-based systems provide advantages over conventional hardware including: (1) availability of the hardware during design and debug; (2) programmability; and (3) visibility. These three advantages can greatly shorten the design and verification cycle. This paper discusses a design environment that exploits these three FPGA-specific advantages to create a unified simulation/execution debug environment implemented in the JHDL design system. The described system provides a hardware debugging environment with the functionality of a simulator but up to 10,000× faster. In addition, testbenches and other typical verification software used in simulators can be used to verify running hardware.

Journal ArticleDOI
01 May 2001
TL;DR: Two contributions to the theory and practice of using reconfigurable hardware to implement search engines based on hashing techniques are reported and a quantitative framework is developed for estimating design trade-offs, such as the amount of temporary storage versus reconfiguration time.
Abstract: This paper reports two contributions to the theory and practice of using reconfigurable hardware to implement search engines based on hashing techniques. The first contribution concerns technology-independent optimisations involving run-time reconfiguration of the hash functions; a quantitative framework is developed for estimating design trade-offs, such as the amount of temporary storage versus reconfiguration time. The second contribution concerns methods for optimising implementations in Xilinx FPGA technology, which achieve different trade-offs in cell utilisation, reconfiguration time and critical path delay; quantitative analysis of these trade-offs is provided.

Proceedings ArticleDOI
03 Jan 2001
TL;DR: The results indicate that FPGA hardware can be generated automatically reducing the design time from days to minutes, with the tradeoff that the automatically generated hardware is 5 times slower than the manually designed hardware.
Abstract: Field Programmable Gate Arrays (FPGAs) have recently been used as an effective platform for implementing many image/signal processing applications. MATLAB is one of the most popular languages to model image/signal processing applications. We present the MATCH compiler that takes MATLAB as input and produces hardware in RTL VHDL, which can be mapped to an FPGA using commercial CAD tools. This dramatically reduces the time to implement an application on an FPGA. We present results on some image and signal processing algorithms for which hardware was synthesized using our compiler for the Xilinx XC4028 FPGA with an external memory. We also present comparisons with manually designed hardware for the applications. Our results indicate that FPGA hardware can be generated automatically, reducing the design time from days to minutes, with the tradeoff that the automatically generated hardware is 5 times slower than the manually designed hardware.