Showing papers on "Reconfigurable computing" published in 1997


Proceedings Article
16 Apr 1997
TL;DR: Novel aspects of the Garp Architecture are presented, as well as a prototype software environment and preliminary performance results, which suggest that a Garp of similar technology could achieve speedups ranging from a factor of 2 to as high as a factor of 24 for some useful applications.
Abstract: Typical reconfigurable machines exhibit shortcomings that make them less than ideal for general-purpose computing. The Garp Architecture combines reconfigurable hardware with a standard MIPS processor on the same die to retain the better features of both. Novel aspects of the architecture are presented, as well as a prototype software environment and preliminary performance results. Compared to an UltraSPARC, a Garp of similar technology could achieve speedups ranging from a factor of 2 to as high as a factor of 24 for some useful applications.

1,030 citations


Proceedings Article
16 Apr 1997
TL;DR: The architecture of a time-multiplexed FPGA is described, which includes extensions for dealing with state saving and forwarding and for increased routing demand due to time-multiplexing the hardware.
Abstract: This paper describes the architecture of a time-multiplexed FPGA. Eight configurations of the FPGA are stored in on-chip memory. This inactive on-chip memory is distributed around the chip, and accessible so that the entire configuration of the FPGA can be changed in a single cycle of the memory. The entire configuration of the FPGA can be loaded from this on-chip memory in 30 ns. Inactive memory is accessible as block RAM for applications. The FPGA is based on the Xilinx XC4000E FPGA, and includes extensions for dealing with state saving and forwarding and for increased routing demand due to time-multiplexing the hardware.

533 citations


Proceedings Article
16 Apr 1997
TL;DR: Chimaera is described, a system that overcomes the communication bottleneck by integrating reconfigurable logic into the host processor itself and enables the creation of multi-operand instructions and a speculative execution model key to high-performance, general-purpose reconfigurable computing.
Abstract: By strictly separating reconfigurable logic from their host processor, current custom computing systems suffer from a significant communication bottleneck. In this paper we describe Chimaera, a system that overcomes this bottleneck by integrating reconfigurable logic into the host processor itself. With direct access to the host processor's register file, the system enables the creation of multi-operand instructions and a speculative execution model key to high-performance, general-purpose reconfigurable computing. It also supports multi-output functions, and utilizes partial run-time reconfiguration to reduce reconfiguration time. Combined, this system can provide speedups of a factor of two or more for general-purpose computing, and speedups of 160 or more are possible for hand-mapped applications.

450 citations


Patent
09 Apr 1997
TL;DR: A compiling system and method are presented for generating a sequence of program instructions for use in a dynamically reconfigurable processing unit whose internal hardware organization is selectively changeable among a plurality of hardware architectures, each executing instructions from a corresponding instruction set.
Abstract: A compiling system and method for generating a sequence of program instructions for use in a dynamically reconfigurable processing unit having an internal hardware organization that is selectively changeable among a plurality of hardware architectures, each hardware architecture executing instructions from a corresponding instruction set. Source files are compiled for execution using various instruction set architectures as specified by reconfiguration directives. Object files optionally encapsulate bitstreams specifying hardware architectures corresponding to instruction set architectures with executable code for execution on the architectures.

142 citations


Patent
16 Oct 1997
TL;DR: In this paper, a method and apparatus for combining emulation and simulation of a logic design is presented, where simulation is performed by one or more microprocessors while emulation is performed in reconfigurable hardware such as field programmable gate arrays.
Abstract: A method and apparatus for combining emulation and simulation of a logic design. The method and apparatus can be used with a logic design that includes gate-level descriptions, behavioral representations, structural representations, or a combination thereof. The emulation and simulation portions are combined in a manner that minimizes the time for transferring data between the two portions. Simulation is performed by one or more microprocessors while emulation is performed in reconfigurable hardware such as field programmable gate arrays. When multiple microprocessors are employed, independent portions of the logic design are selected to be executed on the multiple synchronized microprocessors. Reconfigurable hardware also performs event detecting and scheduling operations to aid the simulation, and to reduce processing time.

141 citations


Proceedings Article
16 Apr 1997
TL;DR: The RAW benchmark suite consists of twelve programs designed to facilitate comparing, validating, and improving reconfigurable computing systems, and includes an architecture-independent compilation framework, Raw Computation Structures (RawCS), to express each algorithm's dependencies and to support automatic synthesis, partitioning, and mapping to a reconfigurable computer.
Abstract: The RAW benchmark suite consists of twelve programs designed to facilitate comparing, validating, and improving reconfigurable computing systems. These benchmarks run the gamut of algorithms found in general purpose computing, including sorting, matrix operations, and graph algorithms. The suite includes an architecture-independent compilation framework, Raw Computation Structures (RawCS), to express each algorithm's dependencies and to support automatic synthesis, partitioning, and mapping to a reconfigurable computer. Within this framework, each benchmark is portably designed in both C and Behavioral Verilog and scalably parameterized to consume a range of hardware resource capacities. To establish initial benchmark ratings, we have targeted a commercial logic emulation system based on virtual wires technology to automatically generate designs up to millions of gates (14 to 379 FPGAs). Because the virtual wires techniques abstract away machine-level details like FPGA capacity and interconnect, our hardware target for this system is an abstract reconfigurable logic fabric with memory-mapped host I/O. We report initial speeds in the range of 2X to 1800X faster than a 2.82 SPECint95 SparcStation 20 and encourage others in the field to run these benchmarks on other systems to provide a standard comparison.

122 citations


Patent
20 Jun 1997
TL;DR: In this paper, a dynamically reconfigurable hardware system provides real-time control of an external device by using a configuration memory (19), a reconfigurable logic module (13), and a processor (17).
Abstract: A dynamically reconfigurable hardware system provides real-time control of an external device. The system has a configuration memory (19), a reconfigurable logic module (13), a processor (17), and a communication port (31). The configuration memory (19) stores a hardware configuration of the reconfigurable logic module (13). The reconfigurable logic module (13), in communication with the configuration memory (19), establishes the hardware configuration in the module (13) on receipt of a configuration signal. The processor (17) is in communication over a data bus (27) with the reconfigurable logic module (13) and the configuration memory (19); it sends the configuration signal to the reconfigurable logic module (13) and establishes the hardware configuration in the configuration memory (19). The communication port (31), coupled to the reconfigurable logic module (13) independently of signals on the data bus (27), controls the external device.

113 citations


01 Jan 1997
TL;DR: The advantages of the FPGA approach to digital filter implementation include higher sampling rates than are available from traditional DSP chips, lower costs than an ASIC for moderate volume applications, and more flexibility than the alternate approaches.
Abstract: Digital filtering algorithms are most commonly implemented using general purpose digital signal processing chips for audio applications, or special purpose digital filtering chips and application-specific integrated circuits (ASICs) for higher rates. This paper describes an approach to the implementation of digital filter algorithms based on field programmable gate arrays (FPGAs). The advantages of the FPGA approach to digital filter implementation include higher sampling rates than are available from traditional DSP chips, lower costs than an ASIC for moderate volume applications, and more flexibility than the alternate approaches. Since many current FPGA architectures are in-system programmable, the configuration of the device may be changed to implement different functionality if required. Our examples illustrate that the FPGA approach is both flexible and provides performance comparable or superior to traditional approaches.
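
As a point of reference for the kind of algorithm discussed above, the following is a minimal software sketch of a direct-form FIR filter, written as a delay line followed by multiply-accumulate, which is the structure a hardware datapath would typically mirror. The abstract does not say which filter structures the paper implements, so the choice of a direct-form FIR and the name `fir_filter` are assumptions made purely for illustration.

```python
def fir_filter(samples, coefficients):
    """Direct-form FIR filter: y[n] = sum over k of h[k] * x[n - k].

    Hypothetical illustration only; not taken from the paper.
    Integer inputs keep the arithmetic exact, as in fixed-point hardware.
    """
    # The delay line holds the most recent len(coefficients) input samples,
    # newest first, and starts out cleared to zero.
    delay_line = [0] * len(coefficients)
    output = []
    for x in samples:
        # Shift the new sample in and drop the oldest one.
        delay_line = [x] + delay_line[:-1]
        # One multiply-accumulate per tap.
        output.append(sum(h * d for h, d in zip(coefficients, delay_line)))
    return output


# Feeding a unit impulse returns the coefficients themselves.
impulse_response = fir_filter([1, 0, 0, 0], [1, 2, 3])
```

In hardware each tap would be a separate multiplier and register working in parallel, which is where the higher sampling rates mentioned in the abstract would come from; the loop above only models the arithmetic.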

103 citations


Patent
21 May 1997
TL;DR: In a preferred embodiment, computing modules couple with dual-ported memories and interface with a dynamically reconfigurable Field-Programmable Gate Array, which serves as a computational engine providing direct hardware support for flexible fault tolerance between unconstrained combinations of the computing modules.
Abstract: Computing modules can cooperate to tolerate faults among their members. In a preferred embodiment, computing modules couple with dual-ported memories and interface with a dynamically reconfigurable Field-Programmable Gate Array ("FPGA"). The FPGA serves as a computational engine to provide direct hardware support for flexible fault tolerance between unconstrained combinations of the computing modules. In addition to supporting traditional fault tolerance functions that require bit-for-bit exactness, the FPGA engine is programmed to tolerate faults that cannot be detected through direct comparison of module outputs. Combating these faults requires more complex algorithmic or heuristic approaches that check whether outputs meet user-defined reasonableness criteria. For example, forming a majority from outputs that are not identical but may nonetheless be correct requires taking an inexact vote. The FPGA engine's flexibility extends to allowing for multiprocessing among the modules where the FPGA engine supports message passing. Implementing these functions in hardware instead of software makes them execute faster. The FPGA is reprogrammable, and only the functions required immediately need be implemented. Inactive functions are stored externally in a Read-Only Memory (ROM). The dynamically reconfigurable FPGA gives the fault-tolerant system an output stage that offers low gate complexity by storing the unused "gates" as configuration code in ROM. Lower gate complexity translates to a highly reliable output stage, prerequisite to a fault tolerant system.
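
The idea of an inexact vote mentioned above can be sketched in software as follows. This is only one plausible reading: the function name `inexact_vote`, the absolute-tolerance agreement test, and the use of the median as the voted value are all assumptions, since the abstract leaves the reasonableness criteria to the user and does not describe the voting rule.

```python
from statistics import median


def inexact_vote(outputs, tolerance):
    """Return a voted value if a strict majority of outputs agree within
    `tolerance`, otherwise None.

    Hypothetical illustration only; not the patent's voting logic.
    """
    best_group = []
    for candidate in outputs:
        # Collect every output that lies within the tolerance of this candidate.
        group = [x for x in outputs if abs(x - candidate) <= tolerance]
        if len(group) > len(best_group):
            best_group = group
    # Require a strict majority of all modules before accepting the result.
    if 2 * len(best_group) > len(outputs):
        return median(best_group)
    return None


# Three modules agree approximately; the fourth is treated as faulty.
voted = inexact_vote([10.0, 10.1, 9.9, 42.0], tolerance=0.25)
```

A bit-for-bit majority vote is the special case `tolerance=0`.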

102 citations


Proceedings Article
04 Jan 1997
TL;DR: The research area is introduced, a classification of reconfigurable architectures is presented, various implementation options are presented, and some of the open research problems related to reconfigurable computing are described.
Abstract: Reconfigurable computing is a new and emerging field that makes use of programmable devices to construct "custom computing machinery". In this paper, we introduce the research area, present a classification of reconfigurable architectures, present various implementation options, and describe some of the open research problems related to reconfigurable computing.

99 citations


Book Chapter
01 Sep 1997
TL;DR: A new approach to implementing satisfiability (SAT) on reconfigurable hardware is presented; it relies on fine-grain massive parallelism, and its major novel feature is that both the next variable to assign and its value are dynamically determined by a backward model traversal done in hardware.
Abstract: In this paper we present a new approach to implement satisfiability (SAT) on reconfigurable hardware. Given a combinational circuit C, we automatically design a SAT circuit whose architecture implements a branch-and-bound SAT algorithm specialized for C. A major novel feature is that both the next variable to assign and its value are dynamically determined by a backward model traversal done in hardware. Our approach relies on fine-grain massive parallelism.

Proceedings Article
08 Oct 1997
TL;DR: It is illustrated that the current mainstream approach based on placement and routing is not very likely to obtain the area-efficiency and throughput needed to cope with the emerging crisis cost of future silicon technology generations.
Abstract: The paper is a proposal for a radical methodological change in R&D of dynamically reconfigurable circuits. The paper illustrates that the current mainstream approach based on placement and routing is not very likely to obtain the area-efficiency and throughput needed to cope with the emerging crisis cost of future silicon technology generations. The proposed changes include both architectural principles and fundamental issues in application development support environments. The paper illustrates the feasibility of general purpose programmable accelerators and their commercialization. The paper highlights computer systems' increasing dependency on add-on accelerators. It shows why only by a new methodology will reconfigurable hardware overcome its role as a niche technology and become competitive with ASICs and other hardwired accelerators. It illustrates the possible coming crisis of ASIC design based on wasting chip area by placement and routing and discusses the vision of software-only implementation of accelerators.

Proceedings Article
09 Feb 1997
TL;DR: The results of this design work provide important clues as to how to improve FPGA architectures to better support real-time signal processing at hundreds of MHz and indicate that clock buffering is frequently the cause of ultimate failure in speed and performance tests.
Abstract: This paper describes an application in high-performance signal processing using reconfigurable computing engines. The application is a 250 MHz cross-correlator for radio astronomy and was developed using the fastest available Xilinx FPGAs. We will report experimental results on the operation of CMOS FPGAs at 250 MHz, and describe the architectural innovations required to build a 250 MHz reconfigurable signal processor. Extensions of the technique to a variety of high-performance real-time signal processing algorithms are discussed. The results of this design work provide important clues as to how to improve FPGA architectures to better support real-time signal processing at hundreds of MHz. In particular, direct routing resources between logic elements are critical to preserving high performance. These routing resources need to be symmetric in order to allow for two-way communications between logic elements. Four-way symmetry and regularity would allow for orthogonal transformations of processing elements in a hierarchical fashion. Finally, experimental results indicate that clock buffering is frequently the cause of ultimate failure in speed and performance tests. Wave pipelining techniques may be suitable in clock distribution to improve performance to match that of other elements in the system.

Book Chapter
01 Sep 1997
TL;DR: The Java Environment for Reconfigurable Computing (JERC) is a software environment for reconfigurable coprocessor applications that consists of only a standard Java compiler and a set of libraries.
Abstract: The Java Environment for Reconfigurable Computing (JERC) is a software environment for reconfigurable coprocessor applications. This environment consists of only a standard Java compiler and a set of libraries. Using JERC, configuration, reconfiguration and host runtime operation are supported. JERC also features design compile times on the order of seconds and built-in support for parameterized macros.

Proceedings Article
07 Apr 1997
TL;DR: Strategies for designing hardware and software simulation entities are introduced to reduce the impact of size and performance constraints imposed by the cosimulation environment while addressing the issues of time management and synchronization.
Abstract: A heterogeneous environment for hardware/software cosimulation is described. This environment permits a portion of an application's subsystems to be simulated using reconfigurable hardware while the remainder of the subsystems are simulated using software. An Aptix FPCB populated with Xilinx FPGAs serves as the hardware simulation platform while an IBM-compatible PC serves as the software simulation platform. The two platforms are connected using an Altera reconfigurable logic board which allows the development of a high-speed interface for communication. This paper focuses on the difficulties associated with designing and interfacing simulation entities in this heterogeneous environment. Strategies for designing hardware and software simulation entities are introduced. These strategies reduce the impact of size and performance constraints imposed by the cosimulation environment while addressing the issues of time management and synchronization. A simple queueing application is used to illustrate a design methodology which incorporates these design strategies.

Proceedings Article
16 Apr 1997
TL;DR: A systematic comparison of two promising arithmetic architecture classes based on standard base representation and composite fields found that composite field multipliers can be more than twice as fast as polynomial base multipliers on FPGAs and EPLD devices.
Abstract: Reed-Solomon (RS) error correction codes are being widely used in modern communication systems such as compact disk players or satellite communication links. RS codes rely on arithmetic in finite, or Galois fields. The specific field GF(2^8) is of central importance for many practical systems. The most costly, and thus most critical, elementary operations in RS decoders are multiplication and inversion in Galois fields. Although there have been considerable efforts in the area of Galois field arithmetic architectures, there appears to be very little reported work for Galois field arithmetic for reconfigurable hardware. This contribution provides a systematic comparison of two promising arithmetic architecture classes. The first one is based on a standard base representation, and the second one is based on composite fields. For both classes a multiplier and an inverter for GF(2^8) are described and theoretical gate counts are provided. Using a design entry based on a VHDL description, each architecture is mapped to a popular FPGA and EPLD device. For each mapping an area and a speed optimization was performed. Absolute values with respect to logic cell counts and critical path simulations are provided. The results show that the composite field architectures can have great advantages on both types of reconfigurable platforms. In particular it is found that composite field multipliers can be more than twice as fast as polynomial base multipliers on FPGAs.
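
To make the two elementary operations named above concrete, here is a small software sketch of multiplication and inversion in GF(2^8) using the standard (polynomial) basis. It is not the paper's architecture: the reduction polynomial 0x11D (x^8 + x^4 + x^3 + x^2 + 1) is an assumption, chosen because it is a common choice for Reed-Solomon codes, and the function names are hypothetical. The composite-field alternative that the paper compares against is not shown.

```python
# Assumption: field polynomial x^8 + x^4 + x^3 + x^2 + 1 (0x11D).
# The abstract does not state which polynomial the paper uses.
REDUCTION_POLY = 0x11D


def gf256_mul(a: int, b: int) -> int:
    """Shift-and-add multiplication of two GF(2^8) elements (0..255)."""
    result = 0
    while b:
        if b & 1:
            result ^= a          # addition in GF(2^8) is XOR
        b >>= 1
        a <<= 1                  # multiply a by x
        if a & 0x100:
            a ^= REDUCTION_POLY  # reduce modulo the field polynomial
    return result


def gf256_inv(a: int) -> int:
    """Multiplicative inverse via a^254, since a^255 = 1 for nonzero a."""
    if a == 0:
        raise ZeroDivisionError("0 has no inverse in GF(2^8)")
    result, base, exponent = 1, a, 254
    while exponent:
        if exponent & 1:
            result = gf256_mul(result, base)
        base = gf256_mul(base, base)
        exponent >>= 1
    return result


# x * x^7 = x^8, which reduces to x^4 + x^3 + x^2 + 1 = 0x1D.
product = gf256_mul(0x02, 0x80)
```

In hardware the multiplier is a fixed network of AND and XOR gates rather than a loop, which is why gate counts and critical path length are the natural cost measures in the comparison described above.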

Book Chapter
01 Sep 1997
TL;DR: A maximum matching based algorithm is presented to reconfigure the placement on an FPGA with little or no impact on circuit performance and results indicate the algorithm works well for both fault tolerance and reconfigurable computing applications.
Abstract: Field-programmable gate arrays have the potential to provide reconfigurability in the presence of faults. In this paper, we have investigated the problem of partially reconfiguring FPGA mapped designs. We present a maximum matching based algorithm to reconfigure the placement on an FPGA with little or no impact on circuit performance. Experimental results indicate the algorithm works well for both fault tolerance and reconfigurable computing applications. We also present the motivation and feasibility of using a similar approach for dynamic circuit reconfigurability.

Proceedings Article
16 Apr 1997
TL;DR: The authors present an integrated tool set to generate highly optimized hardware computation blocks from a C language subset, specifically targeted to fine grained FPGAs such as the National Semiconductor CLAy™ FPGA family.
Abstract: The authors present an integrated tool set to generate highly optimized hardware computation blocks from a C language subset. By starting with a C language description of the algorithm, they address the problem of making FPGA processors accessible to programmers as opposed to hardware designers. Their work is specifically targeted to fine grained FPGAs such as the National Semiconductor CLAy™ FPGA family. Such FPGAs exhibit extremely high performance on regular data path circuits, which are more prevalent in computationally oriented hardware applications. Dense packing of data path functional elements makes it possible to fit the computation on one or a small number of chips, and the use of local routing resources makes it possible to clock the chip at a high rate. By developing a lower level tool suite that exploits the regular, geometric nature of fine grained FPGAs, and mapping the compiler output to this tool suite, they greatly improve performance over traditional high level synthesis to fine grained FPGAs.

01 Jan 1997
TL;DR: The MATRIX chip represents a novel, reconfigurable computing architecture which supports configurable instruction distribution, and a coarse-grained building block which can serve as an instruction store, a memory element, or a computational element.
Abstract: The MATRIX chip represents a novel, reconfigurable computing architecture which supports configurable instruction distribution. Device resources are allocated to controlling and describing the computation on a per-task basis. Application-specific regularity and parallelism allows us to compress the resources allocated to instruction control and distribution, in many situations yielding more resources for datapaths and computations. This flexibility is made possible by a multi-level configuration scheme, a unified configurable network supporting both datapath and instruction distribution, and a coarse-grained building block which can serve as an instruction store, a memory element, or a computational element. (Author note: Andre DeHon may now be contacted at U.C. Berkeley, Soda Hall #1776, Berkeley, CA 94720-1776.)

Proceedings Article
13 Nov 1997
TL;DR: This paper presents a method for the testing and diagnosis of faults in FPGAs that imposes no hardware overhead, and requires minimal support from external test equipment.
Abstract: Since field-programmable gate arrays (FPGAs) are reprogrammable, faults in them can be easily tolerated once fault sites are located. In this paper we present a method for the testing and diagnosis of faults in FPGAs. The proposed method imposes no hardware overhead, and requires minimal support from external test equipment. Test time depends only on the number of faults, and is independent of the chip size. With the help of this technique, chips with faults can still be used. As a result, the chip yield can be improved and chip cost is reduced. Experimental results are given to show the feasibility of this method.

Proceedings Article
11 Dec 1997
TL;DR: This paper describes the Armstrong III architecture and concludes with a substantive example application that performs HMM Training for speech recognition with the reconfigurable platform.
Abstract: Armstrong III is a multi-node multicomputer designed and built at the Laboratory for Engineering Man/Machine Systems (LEMS) of Brown University. Each node contains a RISC processor and reconfigurable resources implemented with FPGAs. The primary benefit in using FPGAs is that the resulting hardware is neither rigid nor permanent but is in-circuit reprogrammable. This allows each node to be tailored to the computational requirements of an application. This paper describes the Armstrong III architecture and concludes with a substantive example application that performs HMM training for speech recognition with the reconfigurable platform.

Proceedings Article
16 Apr 1997
TL;DR: A hardware accelerator is presented which exploits the fine-grain parallelism in routing individual nets, with the goal of accelerating routing of FPGAs tenfold through a combination of processor clusters and hardware acceleration.
Abstract: The authors describe their experience and progress in accelerating an FPGA router. Placement and routing is undoubtedly the most time-consuming process in automatic chip design or configuring programmable logic devices as reconfigurable computing elements. Their goal is to accelerate routing of FPGAs tenfold with a combination of processor clusters and hardware acceleration. Coarse-grain parallelism is exploited by having several processors route separate groups of nets in parallel. A hardware accelerator is presented which exploits the fine-grain parallelism in routing individual nets.

Proceedings Article
12 Oct 1997
TL;DR: This paper summarizes the positions of the panelists who presented their views on the new challenges in compilers for future systems-on-silicon, and who debated these issues at ICCD '97.
Abstract: While software is already a significant component in today's system-on-silicon, it is expected to greatly dominate future generations of systems-on-silicon that will contain several (possibly heterogeneous) programmable processors, as well as reconfigurable hardware blocks and large amounts of on-chip memory. In this context, one could argue that the problems for compiler designers haven't changed at all, since the basic issues all boil down to the traditional challenges of: 1) coarse-grain parallelism extraction for the multiple processors on chip, 2) instruction level parallelism exploitation for individual processors, and 3) retargetable code generation for versions of the on-chip processors. This paper summarizes the positions of the panelists who presented their views on the new challenges in compilers for future systems-on-silicon, and who debated these issues at ICCD '97.

Proceedings Article
Miron Abramovici, P. Menon
16 Apr 1997
TL;DR: A new approach to fault simulation, using reconfigurable hardware to implement a critical path tracing algorithm, is estimated to be at least one order of magnitude faster than serial fault emulation used in prior work.
Abstract: The authors introduce a new approach to fault simulation, using reconfigurable hardware to implement a critical path tracing algorithm. The performance estimate shows that the approach is at least one order of magnitude faster than serial fault emulation used in prior work.

Proceedings Article
26 Nov 1997
TL;DR: A novel paradigm for designing fault-tolerant processor arrays inspired by biological processes is presented and can be applied to the design of fault-tolerant FPGAs, or for improvement of yield in the fabrication of WSI systems.
Abstract: A novel paradigm for designing fault-tolerant processor arrays inspired by biological processes is presented. These ideas can be applied to the design of fault-tolerant FPGAs, or for improvement of yield in the fabrication of WSI systems. Embryonics is a nascent discipline, therefore much research must be done to investigate in depth the fault-tolerant properties of these systems. Our approach is coherent with a recent uproar about the application of biological concepts to the solution of engineering problems, e.g., evolutionary computing, evolvable hardware, and genetic algorithms.

Journal Article
TL;DR: This paper provides three new algorithms for the pin assignment of multi-FPGA systems with arbitrary topologies, and shows that the force-directed approach produces better mappings, in significantly shorter time, than any of the other approaches.
Abstract: Multi-FPGA systems have tremendous potential, providing a high-performance computing substrate for many different applications. One of the keys to achieving this potential is a complete, automatic mapping solution that creates high-quality mappings in the shortest possible time. In this paper, we consider one step in this process, the assignment of inter-FPGA signals to specific I/O pins on the FPGAs in a multi-FPGA system. We show that this problem can neither be handled by pin assignment methods developed for other applications nor standard routing algorithms. Although current mapping systems ignore this issue, we show that an intelligent pin assignment method can achieve both quality and mapping speed improvements over random approaches. Intelligent pin assignment methods already exist for multi-FPGA systems, but are restricted to topologies where logic-bearing FPGAs cannot be directly connected. In this paper, we provide three new algorithms for the pin assignment of multi-FPGA systems with arbitrary topologies. We compare these approaches on several mappings to current multi-FPGA systems, and show that the force-directed approach produces better mappings, in significantly shorter time, than any of the other approaches.

Proceedings Article
Nalini K. Ratha, Anil K. Jain
20 Oct 1997
TL;DR: This paper describes the use of a custom computing approach to meet the computation and communication needs of computer vision algorithms and demonstrates the advantages of this approach using Splash 2, a Xilinx 4010-based custom computer.
Abstract: Algorithms in computer vision are characterized by (i) complex and repetitive operations; (ii) large amounts of data and (iii) a variety of data interaction (e.g., point operations, neighborhood operations, global operations). Based on the computation and communication complexity, vision algorithms have been characterized into three categories: (i) low-level, (ii) intermediate-level and (iii) high-level. In this paper, we describe the use of a custom computing approach to meet the computation and communication needs of computer vision algorithms. By customizing hardware architecture for every application at the instruction level, the optimal grain size needed for the problem at hand and the instruction granularity can be matched. Field Programmable Gate Array (FPGA) based processing elements (PEs) are being used to provide this facility. Using programmable communication resources, the diverse communication requirements can be met. A vision system needs to integrate hardware for the three levels. A custom computing approach alleviates the problem of achieving optimal granularity for different stages as the same hardware gets reconfigured at a software level for different levels of the application. We demonstrate the advantages of our approach using Splash 2, a Xilinx 4010-based custom computer.

Book Chapter
29 Oct 1997
TL;DR: An implementation of GSAT on Field Programmable Gate Arrays (FPGAs) is given in order to speed up the resolution of SAT problems and to enable real-time resolution for current size problems.
Abstract: GSAT is a greedy local search procedure. It searches for satisfiable instantiations of formulas under conjunctive normal form. Intrinsically incomplete, this algorithm has shown its ability to deal with formulas of large size that are not yet accessible to exhaustive methods. Many problems such as planning, scheduling, and vision can efficiently be solved by using the GSAT algorithm. In this study, we give an implementation of GSAT on Field Programmable Gate Arrays (FPGAs) in order to speed up the resolution of SAT problems. By this implementation, our aim is to solve large SAT problems and to enable real-time resolution for current size problems. The FPGA technology [12] allows users to adapt a generic logic chip to different tasks. In the framework of SAT problems we show how to quickly adapt our chips to efficiently solve satisfiability problems.
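
For readers unfamiliar with GSAT, the following is a minimal software sketch of the greedy local search described above. It follows the commonly cited textbook form of the procedure (random restarts, then repeatedly flipping the variable that maximizes the number of satisfied clauses); the paper's FPGA implementation is not described in the abstract and is not reproduced here. The clause encoding (lists of signed integers, DIMACS style) and the parameter names `max_tries`, `max_flips`, and `seed` are assumptions.

```python
import random


def num_satisfied(clauses, assignment):
    """Count clauses with at least one true literal.

    A literal is a nonzero int: v means variable v is True, -v means False.
    """
    return sum(
        any(assignment[abs(lit)] == (lit > 0) for lit in clause)
        for clause in clauses
    )


def gsat(clauses, num_vars, max_tries=10, max_flips=100, seed=0):
    """Greedy local search for a satisfying assignment of a CNF formula.

    Returns a dict {variable: bool} or None. Being incomplete, a None
    result does not prove that the formula is unsatisfiable.
    """
    rng = random.Random(seed)
    for _ in range(max_tries):
        # Restart from a fresh random assignment.
        assignment = {v: rng.random() < 0.5 for v in range(1, num_vars + 1)}
        for _ in range(max_flips):
            if num_satisfied(clauses, assignment) == len(clauses):
                return assignment
            # Score every single-variable flip and keep the best ones.
            best_score, best_vars = -1, []
            for v in range(1, num_vars + 1):
                assignment[v] = not assignment[v]
                score = num_satisfied(clauses, assignment)
                assignment[v] = not assignment[v]
                if score > best_score:
                    best_score, best_vars = score, [v]
                elif score == best_score:
                    best_vars.append(v)
            # Flip one of the best variables, breaking ties at random.
            chosen = rng.choice(best_vars)
            assignment[chosen] = not assignment[chosen]
        if num_satisfied(clauses, assignment) == len(clauses):
            return assignment
    return None


# (x1 or x2) and (not x1 or x3) and (not x2 or not x3)
formula = [[1, 2], [-1, 3], [-2, -3]]
model = gsat(formula, num_vars=3)
```

The inner scoring loop evaluates every candidate flip independently, which is the part a hardware implementation could plausibly evaluate in parallel.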

Proceedings Article
09 May 1997
TL;DR: This paper provides a report on the FPGA implementation of both fuzzy system and neural network architectures and also offers a comparison with the hybrid structure.
Abstract: Over the last thirty years, since Zadeh first introduced fuzzy set theory, there has been widespread interest in the real-time application of fuzzy logic, particularly in the area of control. Recently, there has been considerable interest in the development of dedicated hardware implementations which facilitate high speed processing. However, the main drawback of such an approach is that it is only cost effective for high-volume applications. A more feasible methodology for lower volume problems demands the application of general-purpose or programmable hardware such as the Xilinx FPGAs. There has been a similar trend in the area of neural networks, as initial research employed software simulations but more recent interest has investigated hardware implementations. FPGAs are becoming increasingly popular for prototyping and designing complex hardware systems. The structure of an FPGA can be described as an "array of blocks" connected together via programmable interconnections. The main advantage of FPGAs is the flexibility that they afford. An engineer can change and refine a device's design by exploiting the device's reprogrammability. Xilinx introduced the world's first FPGA, the XC2064, in 1985. This contained approximately 1000 logic gates. Since then, the gate density of Xilinx FPGAs has increased 25 times. There has been a lot of recent interest in the FPGA realisation of fuzzy systems. Similarly there are a number of FPGA implementations of neural networks reported in the literature. However, this paper provides a report on the implementation of both architectures and also offers a comparison with the hybrid structure.

Book Chapter
01 Sep 1997
TL;DR: Riley-2, a new platform developed at Imperial College, is shown to meet a number of requirements for an ideal platform for codesign research and an image processing application running on a PC with the Riley-2 and a Quickcam camera is described.
Abstract: The paper proposes a number of requirements for an ideal platform for codesign research. Riley-2, a new platform developed at Imperial College, is shown to meet these requirements. It is a PCI based board consisting mainly of four dynamically reconfigurable FPGAs and an embedded processor. A VHDL model of the Riley-2 system, including all major components except the PCI interface, has been produced. Two design routes, one based on VHDL with parametrised hardware libraries and the other based on a codesign language called Cedar, have been developed for Riley-2. Finally, an image processing application running on a PC with the Riley-2 and a Quickcam camera is described.