
Showing papers on "Reconfigurable computing published in 2005"


Journal ArticleDOI
25 Jul 2005
TL;DR: It is shown that reconfigurable computing designs are capable of achieving up to 500 times speedup and 70% energy savings over microprocessor implementations for specific applications.
Abstract: Reconfigurable computing is becoming increasingly attractive for many applications. This survey covers two aspects of reconfigurable computing: architectures and design methods. The paper includes recent advances in reconfigurable architectures, such as the Altera Stratix II and Xilinx Virtex-4 FPGA devices. The authors identify major trends in general-purpose and special-purpose design methods. It is shown that reconfigurable computing designs are capable of achieving up to 500 times speedup and 70% energy savings over microprocessor implementations for specific applications.

414 citations


Journal ArticleDOI
TL;DR: The Berkeley Emulation Engine 2 (BEE2) project is developing a reusable, modular, and scalable framework for designing high-end reconfigurable computers, including a processing-module building block and several programming models.
Abstract: The Berkeley Emulation Engine 2 (BEE2) project is developing a reusable, modular, and scalable framework for designing high-end reconfigurable computers, including a processing-module building block and several programming models. Using these elements, BEE2 can provide over 10 times more computing throughput than a DSP-based system with similar power consumption and cost and over 100 times that of a microprocessor-based system.

304 citations


Proceedings ArticleDOI
20 Feb 2005
TL;DR: A novel packet classification architecture called BV-TCAM is presented, which is implemented for an FPGA-based Network Intrusion Detection System (NIDS), which can report multiple matches at gigabit per second network link rates.
Abstract: FPGA technology for real-time network intrusion detection has recently attracted substantial research effort. In this paper, a novel packet classification architecture called BV-TCAM is presented, which is implemented for an FPGA-based Network Intrusion Detection System (NIDS). The classifier can report multiple matches at gigabit per second network link rates. The BV-TCAM architecture combines the Ternary Content Addressable Memory (TCAM) and the Bit Vector (BV) algorithm to effectively compress the data representations and boost throughput. A tree-bitmap implementation of the BV algorithm is used for source and destination port lookup while a TCAM performs the lookup of the other header fields, which can be represented as a prefix or exact value. The architecture eliminates the requirement for prefix expansion of port ranges. With the aid of a small embedded TCAM, packet classification can be implemented in a relatively small part of the available logic of an FPGA. The design is prototyped and evaluated in a Xilinx FPGA XCV2000E on the FPX platform. Even with the most difficult set of rules and packet inputs, the circuit is fast enough to sustain OC48 traffic throughput. Using larger and faster FPGAs, the system can work at speeds greater than OC192.
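The core idea of the Bit Vector approach — each header field lookup yields a bitmask of the rules it matches, and a bitwise AND across fields reports all matching rules at once — can be sketched in software. The rule set and field values below are invented for illustration; the paper's hardware performs the per-field lookups with a tree bitmap and a TCAM rather than a loop.

```python
# Hypothetical sketch of the Bit Vector (BV) classification idea:
# each field lookup produces a bitmask of matching rules, and one
# bitwise AND reports ALL matching rules simultaneously.

RULES = [
    {"src_port": range(0, 1024), "proto": 6},     # rule 0: low src ports, TCP
    {"src_port": range(80, 81),  "proto": 6},     # rule 1: src port 80, TCP
    {"src_port": range(0, 65536), "proto": 17},   # rule 2: any src port, UDP
]

def field_bitmask(field, value):
    """Return a bitmask with bit i set if rule i matches this field value."""
    mask = 0
    for i, rule in enumerate(RULES):
        if field == "src_port" and value in rule["src_port"]:
            mask |= 1 << i
        elif field == "proto" and value == rule["proto"]:
            mask |= 1 << i
    return mask

def classify(src_port, proto):
    """AND the per-field bitmasks; every set bit is a matching rule."""
    mask = field_bitmask("src_port", src_port) & field_bitmask("proto", proto)
    return [i for i in range(len(RULES)) if mask & (1 << i)]

print(classify(80, 6))  # a TCP packet from source port 80 matches rules 0 and 1
```

Reporting multiple matches falls out for free here, which is exactly the property the NIDS use case needs, since several Snort-style rules can apply to one packet.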

234 citations


Journal ArticleDOI
TL;DR: A detailed and flexible power model which has been integrated in the widely used Versatile Place and Route (VPR) CAD tool is described, which estimates the dynamic, short-circuit, and leakage power consumed by FPGAs.
Abstract: Power has become a critical issue for field-programmable gate array (FPGA) vendors. Understanding the power dissipation within FPGAs is the first step in developing power-efficient architectures and computer-aided design (CAD) tools for FPGAs. This article describes a detailed and flexible power model which has been integrated in the widely used Versatile Place and Route (VPR) CAD tool. This power model estimates the dynamic, short-circuit, and leakage power consumed by FPGAs. It is the first flexible power model developed to evaluate architectural tradeoffs and the efficiency of power-aware CAD tools for a variety of FPGA architectures, and is freely available for noncommercial use. The model is flexible, in that it can estimate the power for a wide variety of FPGA architectures, and it is fast, in that it does not require extensive simulation, meaning it can be used to explore a large architectural space. We show how the model can be used to investigate the impact of various architectural parameters on the energy consumed by the FPGA, focusing on the segment length, switch block topology, lookup-table size, and cluster size.
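The dynamic component such a model estimates follows the standard switching-power relation, P = ½·α·C·V²·f, summed over capacitive nodes. A toy version (not the actual VPR model, and with invented node capacitances and activities) might look like:

```python
# Toy dynamic-power estimate in the spirit of an FPGA power model
# (NOT the actual VPR model): P_dyn = 0.5 * activity * C * Vdd^2 * f,
# summed over every capacitive node in the mapped circuit.

def dynamic_power(nodes, vdd, freq):
    """nodes: list of (switching_activity, capacitance_farads) pairs."""
    return sum(0.5 * a * c * vdd**2 * freq for a, c in nodes)

# Three example nodes: a busy routing wire, a LUT output, a clock buffer.
nodes = [(0.2, 150e-15), (0.1, 30e-15), (1.0, 60e-15)]
p = dynamic_power(nodes, vdd=1.5, freq=200e6)
print(f"{p * 1e6:.1f} uW")
```

Architectural parameters such as segment length and cluster size enter a real model through the per-node capacitances, which is why the model can rank architectures without full circuit simulation.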

187 citations


Book
14 Dec 2005
TL;DR: A one-of-a-kind survey of the field of Reconfigurable Computing gives a comprehensive introduction to a discipline that offers a 10X-100X acceleration of algorithms over microprocessors.
Abstract: A one-of-a-kind survey of the field of Reconfigurable Computing. Gives a comprehensive introduction to a discipline that offers a 10X-100X acceleration of algorithms over microprocessors. Discusses the impact of reconfigurable hardware on a wide range of applications: signal and image processing, network security, bioinformatics, and supercomputing. Includes the history of the field as well as recent advances, and an extensive bibliography of primary sources.

178 citations


Book
01 Jan 2005
TL;DR: C-based techniques for building high-performance, FPGA-accelerated software applications are introduced, covering FPGA performance optimization, design flexibility, and time to market.
Abstract: C-based techniques for building high-performance, FPGA-accelerated software applications, optimizing FPGA performance, design flexibility, and time to market. Foreword written by Clive "Max" Maxfield. High-performance FPGA-accelerated software applications are a growing demand in fields ranging from communications and image processing to biomedical and scientific computing. This book introduces powerful, C-based parallel-programming techniques for creating these applications, verifying them, and moving them into FPGA hardware. The authors bridge the chasm between "conventional" software development and the methods and philosophies of FPGA-based digital design. Software engineers will learn to look at FPGAs as "just another programmable computing resource," while achieving phenomenal performance because much of their code is running directly in hardware. Hardware engineers will master techniques that perfectly complement their existing HDL expertise, while allowing them to explore design alternatives and create prototypes far more rapidly. Both groups will learn how to leverage C to support efficient hardware/software co-design and improve compilation, debugging, and testing. Readers will learn to:
- Understand when C makes sense in FPGA development and where it fits into existing processes
- Leverage C to implement software applications directly onto mixed hardware/software platforms
- Execute and test the same C algorithms in desktop PC environments and in-system using embedded processors
- Master new, C-based programming models and techniques optimized for highly parallel FPGA platforms
- Supercharge performance by optimizing through automated compilation
- Use multiple-process streaming programming models to deliver truly astonishing performance
- Preview the future of FPGA computing
- Study an extensive set of realistic C code examples
About the website: visit http://www.ImpulseC.com/practical to download fully operational, time-limited versions of a C-based FPGA design compiler, as well as additional examples and programming tips. © Copyright Pearson Education. All rights reserved.

176 citations


Patent
18 Jan 2005
TL;DR: In this article, the authors present a system and method for online configuration of a measurement system, where the user can access a server over a network and specify a desired task, and receive programs and/or configuration information which are usable to configure the user's measurement system hardware (and/or software) to perform the desired task.
Abstract: A system and method for online configuration of a measurement system. The user may access a server over a network and specify a desired task, e.g., a measurement task, and receive programs and/or configuration information which are usable to configure the user's measurement system hardware (and/or software) to perform the desired task. Additionally, if the user does not have the hardware required to perform the task, the required hardware may be sent to the user, along with programs and/or configuration information. The hardware may be reconfigurable hardware, such as an FPGA or a processor/memory based device. In one embodiment, the required hardware may be pre-configured to perform the task before being sent to the user. In another embodiment, the system and method may provide a graphical program in response to receiving the user's task specification, where the graphical program may be usable by the measurement system to perform the task.

145 citations


Journal ArticleDOI
03 Jun 2005
TL;DR: The design and realisation of a high level framework for the implementation of 1-D and 2-D FFTs for real-time applications and an FPGA-based parametrisable environment based on 2-D FFT is presented as a solution for frequency-domain image filtering application.
Abstract: Applications based on the fast Fourier transform (FFT), such as signal and image processing, require high computational power, plus the ability to experiment with algorithms. Reconfigurable hardware devices in the form of field programmable gate arrays (FPGAs) have been proposed as a way of obtaining high performance at an economical price. However, users must program FPGAs at a very low level and have a detailed knowledge of the architecture of the device being used. They do not therefore facilitate easy development of, or experimentation with, signal/image processing algorithms. To try to reconcile the dual requirements of high performance and ease of development, this paper reports on the design and realisation of a high level framework for the implementation of 1-D and 2-D FFTs for real-time applications. A wide range of FFT algorithms, including radix-2, radix-4, split-radix and fast Hartley transform (FHT) have been implemented under a common framework in order to enable the system designers to meet different system requirements. Results show that the parallel implementation of 2-D FFT achieves linear speed-up and real-time performance for large matrix sizes. Finally, an FPGA-based parametrisable environment based on 2-D FFT is presented as a solution for frequency-domain image filtering application.
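Radix-2 decimation-in-time is the simplest of the algorithm families the framework covers. A minimal software reference (the hardware versions pipeline the same butterfly structure) might look like:

```python
# Minimal recursive radix-2 decimation-in-time FFT, a software reference
# for one of the algorithm families covered. Input length must be a
# power of two.
import cmath

def fft(x):
    n = len(x)
    if n == 1:
        return list(x)
    even = fft(x[0::2])           # FFT of even-indexed samples
    odd = fft(x[1::2])            # FFT of odd-indexed samples
    out = [0] * n
    for k in range(n // 2):
        tw = cmath.exp(-2j * cmath.pi * k / n) * odd[k]  # twiddle factor
        out[k] = even[k] + tw                            # butterfly
        out[k + n // 2] = even[k] - tw
    return out

# An impulse transforms to a flat spectrum of unit magnitude.
print([abs(v) for v in fft([1, 0, 0, 0])])  # [1.0, 1.0, 1.0, 1.0]
```

A 2-D FFT then reduces to row FFTs followed by column FFTs, which is the structure that parallelises so well on an FPGA and yields the linear speedup the paper reports.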

138 citations


Proceedings ArticleDOI
20 Jun 2005
TL;DR: This paper looks at the advantages and disadvantages of FPGA technology, its suitability for image processing and computer vision tasks, and attempts to suggest some directions for the future.
Abstract: Reconfigurable hardware, in the form of Field Programmable Gate Arrays (FPGAs), is becoming increasingly attractive for digital signal processing problems, including image processing and computer vision tasks. The ability to exploit the parallelism often found in these problems, as well as the ability to support different modes of operation on a single hardware substrate, gives these devices a particular advantage over fixed architecture devices such as serial CPUs and DSPs. Further, development times are substantially shorter than dedicated hardware in the form of Application Specific ICs (ASICs), and small changes to a design can be prototyped in a matter of hours. On the other hand, designing with FPGAs still requires expertise beyond that found in many vision labs today. This paper looks at the advantages and disadvantages of FPGA technology, its suitability for image processing and computer vision tasks, and attempts to suggest some directions for the future.

131 citations


Journal ArticleDOI
TL;DR: The DFG merging process identifies similarities among the DFGs, and produces a single datapath that can be dynamically reconfigured and has a minimum area cost, when considering both hardware blocks and interconnections.
Abstract: Reconfigurable systems have been shown to achieve significant performance speedup through architectures that map the most time-consuming application kernel modules or inner loops to a reconfigurable datapath. As each portion of the application starts to execute, the system partially reconfigures the datapath so as to perform the corresponding computation. The reconfigurable datapath should have as few and simple hardware blocks and interconnections as possible, in order to reduce its cost, area, and reconfiguration overhead. To achieve that, hardware blocks and interconnections should be reused as much as possible across the application. We represent each piece of the application as a data-flow graph (DFG). The DFG merging process identifies similarities among the DFGs, and produces a single datapath that can be dynamically reconfigured and has a minimum area cost, when considering both hardware blocks and interconnections. In this paper we present a novel technique for the DFG merge problem, and we evaluate it using programs from the MediaBench benchmark. Our algorithm execution time approaches the fastest previous solution to this problem and produces datapaths with an average area reduction of 20%. When compared to the best known area solution, our approach produces datapaths with area costs equivalent to (and in many cases better than) it, while achieving impressive speedups.
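The resource-sharing payoff behind merging can be illustrated with a much simpler scheme than the paper's algorithm: if one hardware block of each operation type is time-shared across DFGs, the merged datapath needs the maximum count of each block type over all DFGs rather than the sum. The kernels below are invented, and interconnect sharing (the harder part of the actual DFG merge problem) is ignored.

```python
# Illustrative block-count estimate (NOT the paper's merge algorithm):
# a merged datapath reuses hardware blocks across DFGs, so it needs
# max(count per DFG) of each block type instead of the sum.
from collections import Counter

def merged_block_count(dfgs):
    """dfgs: list of op-type lists, one list per application kernel."""
    need = Counter()
    for dfg in dfgs:
        for op, n in Counter(dfg).items():
            need[op] = max(need[op], n)  # blocks are time-shared across DFGs
    return sum(need.values())

kernel_a = ["add", "add", "mul", "shift"]
kernel_b = ["add", "mul", "mul"]
separate = len(kernel_a) + len(kernel_b)           # 7 blocks without merging
merged = merged_block_count([kernel_a, kernel_b])  # 2 add + 2 mul + 1 shift
print(separate, merged)
```

The paper's contribution is doing this while also minimising the multiplexers and wires needed to reroute data between the shared blocks, which this sketch deliberately leaves out.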

122 citations


Proceedings ArticleDOI
07 Mar 2005
TL;DR: It is shown that specific reconfigurable hardware support improves the performance of the heuristic and that task migration mechanisms need to be tailored to on-chip networks.
Abstract: Run-time management of both communication and computation resources in a heterogeneous Network-on-Chip (NoC) is a challenging task. First, platform resources need to be assigned in a fast and efficient way. Secondly, the resources might need to be reallocated when platform conditions or user requirements change. We developed a run-time resource management scheme that is able to efficiently manage a NoC containing fine grain reconfigurable hardware tiles. This paper details our task assignment heuristic and two run-time task migration mechanisms that deal with the message consistency problem in a NoC. We show that specific reconfigurable hardware tile support improves performance of the heuristic and that task migration mechanisms need to be tailored to on-chip networks.

Journal ArticleDOI
TL;DR: This work presents a new approach to compute multiple sequence alignments in far shorter time using reconfigurable hardware, which results in an implementation of ClustalW with significant runtime savings on a standard off-the-shelf FPGA.
Abstract: Summary: Aligning hundreds of sequences using progressive alignment tools such as ClustalW requires several hours on state-of-the-art workstations. We present a new approach to compute multiple sequence alignments in far shorter time using reconfigurable hardware. This results in an implementation of ClustalW with significant runtime savings on a standard off-the-shelf FPGA. Availability: An online server for ClustalW running on a Pentium IV 3 GHz with a Xilinx XC2V6000 FPGA PCI-board is available at http://beta.projectproteus.org. The PE hardware design in Verilog HDL is available on request from the first author. Contact: tim.oliver@pmail.ntu.edu.sg

Proceedings ArticleDOI
07 Mar 2005
TL;DR: Warp processing is proposed, a technique capable of optimizing a software application by dynamically and transparently re-implementing critical software kernels as custom circuits in on-chip configurable logic, and it is demonstrated that the soft-core based warp processor achieves average speedups of 5.8 and energy reductions of 57% compared to the soft core alone.
Abstract: Field programmable gate arrays (FPGAs) provide designers with the ability to quickly create hardware circuits. Increases in FPGA configurable logic capacity and decreasing FPGA costs have enabled designers to more readily incorporate FPGAs in their designs. FPGA vendors have begun providing configurable soft processor cores that can be synthesized onto their FPGA products. While FPGAs with soft processor cores provide designers with increased flexibility, such processors typically have degraded performance and energy consumption compared to hard-core processors. Previously, we proposed warp processing, a technique capable of optimizing a software application by dynamically and transparently re-implementing critical software kernels as custom circuits in on-chip configurable logic. In this paper, we study the potential of a MicroBlaze soft-core based warp processing system to eliminate the performance and energy overhead of a soft-core processor compared to a hard-core processor. We demonstrate that the soft-core based warp processor achieves average speedups of 5.8 and energy reductions of 57% compared to the soft core alone. Our data shows that a soft-core based warp processor yields performance and energy consumption competitive with existing hard-core processors, thus expanding the usefulness of soft processor cores on FPGAs to a broader range of applications.

Journal ArticleDOI
TL;DR: A platform for evolving spiking neural networks on FPGAs is presented as a combination of three parts: a hardware substrate, a computing engine, and an adaptation mechanism.

Proceedings ArticleDOI
10 Oct 2005
TL;DR: The purpose of this study is to use the coarse-grained architecture for H.264/AVC in order to determine at the physical level whether reconfigurable computing can achieve both high performance and low power.
Abstract: Portable wireless multimedia approaches traditionally achieve the specified performance and power consumption with a hardwired accelerator implementation. Due to the increase of algorithm complexity (Shannon's law), flexibility is needed to achieve shorter development cycles. A coarse-grained reconfigurable computing concept for these requirements is discussed, which supports both flexible control decisions and repetitive numerical operations. The concept includes an architecture template and a compiler and simulator environment. The architecture provides flexible time-multiplexing of code for high-performance data processing while keeping the configuration bandwidth and power requirements low. The purpose of this study is to apply the coarse-grained architecture to H.264/AVC in order to determine at the physical level whether reconfigurable computing can achieve both high performance and low power.

Journal ArticleDOI
TL;DR: The dynamic network on chip (DyNoC) is introduced as a viable communication infrastructure for communication on dynamically reconfigurable devices and algorithms and implementation results from real-life problems are provided.
Abstract: This article presents two approaches to solving the problem of communication between components dynamically placed at runtime on a reconfigurable device. The first is a circuit-routing approach designed for existing FPGAs. This approach uses the reconfigurable multiple bus (RMB). The second, network-based approach targets devices with unlimited reconfiguration capability such as coarse-grained reconfigurable devices. We introduce the dynamic network on chip (DyNoC) as a viable communication infrastructure for communication on dynamically reconfigurable devices. For prototyping the DyNoC on FPGAs, we design and implement an unrestricted communication model for a columnwise-reconfigurable chip. For the DyNoC, as well as for the RMB on chip (RMBoC), we provide algorithms and implementation results from real-life problems.

Journal ArticleDOI
TL;DR: A hardware Gaussian noise generator based on the Wallace method used for a hardware simulation system that accurately models a true Gaussian probability density function even at high σ values is described.
Abstract: We describe a hardware Gaussian noise generator based on the Wallace method used for a hardware simulation system. Our noise generator accurately models a true Gaussian probability density function even at high σ values. We evaluate its properties using: 1) several different statistical tests, including the chi-square test and the Anderson-Darling test and 2) an application for decoding of low-density parity-check (LDPC) codes. Our design is implemented on a Xilinx Virtex-II XC2V4000-6 field-programmable gate array (FPGA) at 155 MHz; it takes up 3% of the device and produces 155 million samples per second, which is three times faster than a 2.6-GHz Pentium-IV PC. Another implementation on a Xilinx Spartan-III XC3S200E-5 FPGA at 106 MHz is two times faster than the software version. Further improvement in performance can be obtained by concurrent execution: 20 parallel instances of the noise generator on an XC2V4000-6 FPGA at 115 MHz can run 51 times faster than software on a 2.6-GHz Pentium-IV PC.
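What makes the Wallace method attractive for hardware is that it turns a pool of existing Gaussian values into new Gaussian values using only additions and a scale, with no log, square root, or trigonometric function per sample. The following is a simplified software analogue of one transformation pass (the group size, transform matrix, and pool size are illustrative choices, not the paper's hardware parameters, and the pool-renormalisation step of the full method is omitted):

```python
# Sketch of one Wallace-method transformation pass: an orthogonal 4x4
# transform maps four pooled Gaussian values to four new Gaussian values.
# Orthogonality preserves the pool's energy (sum of squares), which is
# what keeps the output distribution Gaussian across passes.
import random

H = [[ 0.5,  0.5,  0.5,  0.5],
     [ 0.5, -0.5,  0.5, -0.5],
     [ 0.5,  0.5, -0.5, -0.5],
     [ 0.5, -0.5, -0.5,  0.5]]   # orthogonal (scaled Hadamard) matrix

def wallace_pass(pool):
    """Shuffle the pool, then transform it four values at a time."""
    random.shuffle(pool)
    out = []
    for i in range(0, len(pool), 4):
        v = pool[i:i + 4]
        out.extend(sum(H[r][c] * v[c] for c in range(4)) for r in range(4))
    return out

pool = [random.gauss(0, 1) for _ in range(256)]
new_pool = wallace_pass(pool)
print(abs(sum(x * x for x in pool) - sum(x * x for x in new_pool)) < 1e-9)  # True
```

In the hardware version the shuffle becomes pseudo-random addressing of a pool RAM, which is why the generator sustains one sample per clock.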

Proceedings ArticleDOI
18 Apr 2005
TL;DR: Reconfigurable computing as it could be used in mainstream systems is examined, focusing on a proposed scheduling algorithm to allocate the reconfigurable hardware.
Abstract: Although many studies have demonstrated the benefits of reconfigurable computing, it has not yet penetrated the mainstream. One of the biggest unsolved problems is the management of the reconfigurable hardware in a multi-threaded environment. Most research in reconfigurable computing has assumed a single-threaded model, but this is unrealistic for personal computing and many types of embedded computing. In these cases, there may be several different threads or processes running simultaneously, each wishing to use the reconfigurable hardware. The operating system must decide how to allocate the hardware at run-time based on the status of the system. The system status could also influence the choice of different implementations for each circuit based on area/speed tradeoffs. This paper examines reconfigurable computing as it could be used in mainstream systems, focusing on a proposed scheduling algorithm to allocate the reconfigurable hardware. Our initial tests indicate that reconfigurable computing with our scheduler can easily achieve at least a 20% system-level speedup.
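One plausible shape for such an allocator (a hypothetical sketch, not the paper's scheduler) is a greedy pass over the area/speed implementation options each thread offers, granting fabric by speedup gained per unit of area. All thread names and numbers below are invented:

```python
# Hypothetical run-time allocator sketch: each thread offers circuit
# implementations with different area/speedup tradeoffs, and the OS
# greedily grants reconfigurable fabric by speedup per unit area.

def allocate(requests, total_area):
    """requests: {thread: [(area, speedup), ...]}; grant at most one
    implementation per thread, within the total fabric area."""
    options = [(s / a, a, s, t) for t, impls in requests.items() for a, s in impls]
    options.sort(reverse=True)  # best speedup-per-area first
    granted, used = {}, 0
    for _, area, speedup, thread in options:
        if thread not in granted and used + area <= total_area:
            granted[thread] = (area, speedup)
            used += area
    return granted

reqs = {"video": [(30, 8.0), (20, 5.0)],   # two implementations to choose from
        "crypto": [(25, 6.0)],
        "net": [(50, 4.0)]}
print(allocate(reqs, total_area=60))
```

A real scheduler would also re-run this as threads start and exit, which is where the run-time status of the system influences which implementation of each circuit gets loaded.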

Proceedings ArticleDOI
04 Apr 2005
TL;DR: An FPGA-based hardware design to accelerate the BLAST algorithm, a standard computer application that molecular biologists use to search for sequence similarity in genomic databases.
Abstract: Basic Local Alignment Search Tool (BLAST) is a standard computer application that molecular biologists use to search for sequence similarity in genomic databases. This paper describes an FPGA-based hardware design to accelerate the BLAST algorithm. FPGA-based custom computing machines, more widely known as reconfigurable computing, are supported by a number of vendors and the basic cost of FPGA hardware is dramatically decreasing. Hence, the main objective of this project is to explore the feasibility of using this new technology to realize a portable, open source FPGA-based accelerator for the BLAST algorithm. The present design is targeted to an AceIIcard and the design is based on the latest version of BLAST available from NCBI. Since the entire application does not fit in hardware, a profile study was conducted that identifies the computationally intensive part of BLAST. An FPGA hardware component has been designed and implemented for this critical segment. The portability and cost-effectiveness of the design are discussed.

Journal ArticleDOI
TL;DR: This work introduces a multithreaded programming model for reconfigurable computing based on a unified virtual-memory image for both software and hardware application parts and addresses the challenge of achieving seamless hardware-software interfacing and portability with minimal performance penalties.
Abstract: Ideally, reconfigurable-system programmers and designers should code algorithms and write hardware accelerators independently of the underlying platform. To realize this scenario, the authors propose a portable, hardware-agnostic programming paradigm, which delegates platform-specific tasks to a system-level virtualization layer. This layer supports a chosen programming model and hides platform details from users much as general-purpose computers do. We introduce a multithreaded programming model for reconfigurable computing based on a unified virtual-memory image for both software and hardware application parts. We also address the challenge of achieving seamless hardware-software interfacing and portability with minimal performance penalties.

Proceedings ArticleDOI
01 May 2005
TL;DR: Experimental results show that, using a highly reliable, low-cost mitigation technique, the availability of an FPGA-mapped design can be increased to more than 99%.
Abstract: FPGA-based designs are more susceptible to single-event upsets (SEUs) compared to ASIC designs, since SEUs in configuration bits of FPGAs result in permanent errors in the mapped design. Moreover, the number of sensitive configuration bits is two orders of magnitude more than user bits in typical FPGA-based circuits. In this paper, we present a highly reliable, low-cost mitigation technique which can significantly improve the availability of designs mapped into FPGAs. Experimental results show that, using this technique, the availability of an FPGA-mapped design can be increased to more than 99%.

Proceedings ArticleDOI
10 Oct 2005
TL;DR: A hardware perfect-hashing technique is introduced to access the memory that contains the matching patterns to detect hazardous contents using pattern matching and achieves at least 30% better efficiency compared to previous work, measured in throughput per area required per matching character.
Abstract: In this paper, we consider scanning and analyzing packets in order to detect hazardous contents using pattern matching. We introduce a hardware perfect-hashing technique to access the memory that contains the matching patterns. A subsequent simple comparison between incoming data and memory output determines the match. We implement our scheme in reconfigurable hardware and show that we can achieve a throughput between 1.7 and 5.7 Gbps requiring only a few tens of FPGA memory blocks and 0.30 to 0.57 logic cells per matching character. We also show that our designs achieve at least 30% better efficiency compared to previous work, measured in throughput per area required per matching character.
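The core mechanism — a hash selects exactly one candidate pattern from memory, and a single comparison against the incoming bytes confirms or rejects the match — can be sketched as below. Python's built-in hash over a table grown until it is collision-free stands in for the generated perfect hash function; the patterns are invented examples, not from the paper's rule set.

```python
# Sketch of the perfect-hash match idea: one hash, one memory read,
# one comparison — no probing, no per-pattern scan. The hash function
# here is a stand-in for the paper's generated perfect hash.

PATTERNS = [b"/etc/passwd", b"cmd.exe", b"<script>"]

def build_table(patterns):
    """Find a table size where every pattern lands in its own slot."""
    size = len(patterns)
    while True:
        table = [None] * size
        for p in patterns:
            slot = hash(p) % size
            if table[slot] is not None:
                break                # collision: try a bigger table
            table[slot] = p
        else:
            return table             # collision-free: perfect for this set
        size += 1

TABLE = build_table(PATTERNS)

def match(data):
    """Hash selects ONE candidate; one comparison decides the match."""
    candidate = TABLE[hash(data) % len(TABLE)]
    return candidate == data

print(match(b"cmd.exe"), match(b"cmd.exf"))  # True False
```

In the hardware design the comparison runs in parallel with the next lookup, which is how the pipeline sustains multi-gigabit throughput with only a few memory blocks.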

Proceedings ArticleDOI
18 Apr 2005
TL;DR: Three implementation alternatives for the fast Fourier transform (FFT) on FPGA are explored and the results indicate that FPGAs are competitive with microprocessors in terms of performance and that the "correct" FFT implementation varies based on the size of the transform and the sizes of the FGPAs.
Abstract: Advances in FPGA technology have led to dramatic improvements in double precision floating-point performance. Modern FPGAs boast several GigaFLOPs of raw computing power. Unfortunately, this computing power is distributed across 30 floating-point units with over 10 cycles of latency each. The user must find two orders of magnitude more parallelism than is typically exploited in a single microprocessor; thus, it is not clear that the computational power of FPGAs can be exploited across a wide range of algorithms. This paper explores three implementation alternatives for the fast Fourier transform (FFT) on FPGAs. The algorithms are compared in terms of sustained performance and memory requirements for various FFT sizes and FPGA sizes. The results indicate that FPGAs are competitive with microprocessors in terms of performance and that the "correct" FFT implementation varies based on the size of the transform and the size of the FPGA.

Proceedings ArticleDOI
07 Mar 2005
TL;DR: A hybrid design/run-time prefetch heuristic is developed that schedules the reconfigurations at run-time, but carries out the scheduling computations at design-time by carefully identifying a set of near-optimal schedules that can be selected atRun-time.
Abstract: Due to the emergence of highly dynamic multimedia applications there is a need for flexible platforms and run-time scheduling support for embedded systems. Dynamic Reconfigurable Hardware (DRHW) is a promising candidate to provide this flexibility but, currently, not sufficient run-time scheduling support to deal with the run-time reconfigurations exists. Moreover, executing at run-time a complex scheduling heuristic to provide this support may generate an excessive run-time penalty. Hence, we have developed a hybrid design/run-time prefetch heuristic that schedules the reconfigurations at run-time, but carries out the scheduling computations at design-time by carefully identifying a set of near-optimal schedules that can be selected at run-time. This approach provides run-time flexibility with a negligible penalty.
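The hybrid split can be illustrated in a few lines: the expensive search over reconfiguration orders happens at design time, leaving the run-time layer a simple table lookup keyed by the predicted execution path. The paths and configuration names below are invented for illustration:

```python
# Sketch of the hybrid design/run-time prefetch idea: near-optimal
# reconfiguration schedules are precomputed at design time for each
# likely execution path, so run-time "scheduling" is just a lookup.

# Design time: one precomputed schedule per anticipated execution path.
SCHEDULES = {
    ("decode", "filter"): ["load_decoder", "load_filter"],
    ("decode", "scale"):  ["load_decoder", "load_scaler"],
    ("filter",):          ["load_filter"],
}

def runtime_prefetch(predicted_path):
    """Run time: select the precomputed schedule; fall back to no prefetch."""
    return SCHEDULES.get(tuple(predicted_path), [])

print(runtime_prefetch(["decode", "scale"]))  # ['load_decoder', 'load_scaler']
```

The negligible run-time penalty the paper claims follows directly from this structure: the cost of choosing a schedule is a dictionary lookup, not a scheduling computation.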

Journal ArticleDOI
01 Mar 2005
TL;DR: This paper presents an FPGA implementation of the parallel-beam backprojection algorithm used in CT for which all the requirements are met and shows approximately 100 times speedup over software versions of the same algorithm running on a 1 GHz Pentium, and is more flexible than an ASIC implementation.
Abstract: Medical image processing in general and computerized tomography (CT) in particular can benefit greatly from hardware acceleration. This application domain is marked by computationally intensive algorithms requiring the rapid processing of large amounts of data. To date, reconfigurable hardware has not been applied to the important area of image reconstruction. For efficient implementation and maximum speedup, fixed-point implementations are required. The associated quantization errors must be carefully balanced against the requirements of the medical community. Specifically, care must be taken so that very little error is introduced compared to floating-point implementations and the visual quality of the images is not compromised. In this paper, we present an FPGA implementation of the parallel-beam backprojection algorithm used in CT for which all of these requirements are met. We explore a number of quantization issues arising in backprojection and concentrate on minimizing error while maximizing efficiency. Our implementation shows approximately 100 times speedup over software versions of the same algorithm running on a 1 GHz Pentium, and is more flexible than an ASIC implementation. Our FPGA implementation can easily be adapted to both medical sensors with different dynamic ranges as well as tomographic scanners employed in a wider range of application areas including nondestructive evaluation and baggage inspection in airport terminals.
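The quantization tradeoff at the heart of this design can be shown in miniature: rounding projection data to n fractional bits introduces an error of at most half an LSB, so each extra bit halves the worst-case error while costing hardware width. The sample value below is arbitrary:

```python
# Toy illustration of the fixed-point tradeoff studied for backprojection:
# quantizing to n fractional bits bounds the error by half an LSB,
# trading image accuracy against datapath width in hardware.

def quantize(x, frac_bits):
    """Round x to a fixed-point value with frac_bits fractional bits."""
    scale = 1 << frac_bits
    return round(x * scale) / scale

value = 0.7071067811865476           # e.g. one projection-data sample
for bits in (4, 8, 12):
    q = quantize(value, bits)
    half_lsb = 0.5 / (1 << bits)
    print(bits, q, abs(q - value) <= half_lsb)
```

Balancing this bound against the visual-quality requirements of the medical community is exactly the error analysis the paper carries out across the whole backprojection datapath.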

Journal ArticleDOI
TL;DR: This article describes an approach to dynamic reconfiguration that reduces reconfiguration latency to the point where dynamic multimedia applications can now exploit such platforms.
Abstract: Dynamic reconfiguration has been a technology solution in search of the right problem to solve. Effective use of the technology requires new programming and task management models. This article describes an approach to dynamic reconfiguration that reduces reconfiguration latency to the point where dynamic multimedia applications can now exploit such platforms.

Proceedings ArticleDOI
13 Oct 2005
TL;DR: In this paper, the development of FPGA-based ANNs is presented, and ANN implementation in hardware, mainly FPGA, is presented and discussed.
Abstract: In this paper, the development of FPGA-based ANNs is presented. Field-programmable gate array (FPGA) based artificial neural networks (ANNs) are now becoming a focus of ANN research. Given the parallelism inherent in neural networks (NNs), hardware implementation is superior to the software approach because it can exploit these characteristics. Furthermore, since the FPGA is a digital device with reprogrammable properties and robust flexibility, many researchers have made great efforts toward realizing NNs using FPGA techniques. However, they have encountered many problems in the process. Results of this research are presented and discussed. The future development of FPGA implementations of ANNs is also considered. First, ANN and FPGA techniques are briefly introduced; then ANN implementation in hardware, mainly FPGA, is presented and discussed.

Proceedings ArticleDOI
10 Oct 2005
TL;DR: An FPGA-based pre-filter is presented that reduces the amount of traffic sent to a software-based NIDS for inspection and can reduce up to 90% of network traffic that would have otherwise been processed by Snort software.
Abstract: Software-based network intrusion detection systems (NIDS) often fail to keep up with high-speed network links. In this paper an FPGA-based pre-filter is presented that reduces the amount of traffic sent to a software-based NIDS for inspection. Simulations using real network traces and the Snort rule set show that a pre-filter can reduce up to 90% of network traffic that would have otherwise been processed by Snort software. The projected performance enables a computer to perform real-time intrusion detection of malicious content passing over a 10 Gbps network using FPGA hardware that operates with 10 Gbps of throughput and software that needs only to operate with 1 Gbps of throughput.

Proceedings ArticleDOI
11 Dec 2005
TL;DR: This work presents a new concept as well as the implementation of an FPGA-based reconfigurable platform, the Erlangen slot machine (ESM), allowing an unrestricted relocation of modules on the device.
Abstract: We present a new concept as well as the implementation of an FPGA-based reconfigurable platform, the Erlangen slot machine (ESM). One main advantage of this platform is the possibility for each module to access its periphery independent of its location through a programmable crossbar, allowing an unrestricted relocation of modules on the device. Furthermore, we propose different intermodule communication structures.

Patent
23 Sep 2005
TL;DR: In this article, a method of developing peripherals for integration with a vehicle control system comprises providing a vehicle controller and interconnection system that includes a system core for processing data, an input module and an output module.
Abstract: A method of developing peripherals for integration with a vehicle control system comprises providing a vehicle control and interconnection system that includes a system core for processing data, an input module and an output module. The system core includes a reconfigurable space having reconfigurable hardware, memory and a supervising processor that is customized to the order. The supervising processor is configured to provide control information to identified peripherals and control the allocation and configuration of the reconfigurable space into a plurality of independent information processing workspaces. The associated information processing workspace for the peripherals is configured if required, a verifying operation of the peripherals with the control and interconnection system is performed and the peripherals are authorized as approved peripherals. To integrate the peripherals into the system, design tools assist the developer in configuring an associated information processing workspace, setting up operating conditions or performing other integration tasks.