
Showing papers on "Reconfigurable computing published in 2005"


Journal ArticleDOI
25 Jul 2005
TL;DR: It is shown that reconfigurable computing designs are capable of achieving up to 500 times speedup and 70% energy savings over microprocessor implementations for specific applications.
Abstract: Reconfigurable computing is becoming increasingly attractive for many applications. This survey covers two aspects of reconfigurable computing: architectures and design methods. The paper includes recent advances in reconfigurable architectures, such as the Altera Stratix II and Xilinx Virtex-4 FPGA devices. The authors identify major trends in general-purpose and special-purpose design methods. It is shown that reconfigurable computing designs are capable of achieving up to 500 times speedup and 70% energy savings over microprocessor implementations for specific applications.

414 citations


Journal ArticleDOI
TL;DR: The Berkeley Emulation Engine 2 (BEE2) project is developing a reusable, modular, and scalable framework for designing high-end reconfigurable computers, including a processing-module building block and several programming models.
Abstract: The Berkeley Emulation Engine 2 (BEE2) project is developing a reusable, modular, and scalable framework for designing high-end reconfigurable computers, including a processing-module building block and several programming models. Using these elements, BEE2 can provide over 10 times more computing throughput than a DSP-based system with similar power consumption and cost and over 100 times that of a microprocessor-based system.

304 citations


Proceedings ArticleDOI
20 Feb 2005
TL;DR: A novel packet classification architecture called BV-TCAM is presented, which is implemented for an FPGA-based Network Intrusion Detection System (NIDS), which can report multiple matches at gigabit per second network link rates.
Abstract: FPGA technology for real-time network intrusion detection has recently attracted substantial research effort. In this paper, a novel packet classification architecture called BV-TCAM is presented, which is implemented for an FPGA-based Network Intrusion Detection System (NIDS). The classifier can report multiple matches at gigabit per second network link rates. The BV-TCAM architecture combines the Ternary Content Addressable Memory (TCAM) and the Bit Vector (BV) algorithm to effectively compress the data representations and boost throughput. A tree-bitmap implementation of the BV algorithm is used for source and destination port lookup while a TCAM performs the lookup of the other header fields, which can be represented as a prefix or exact value. The architecture eliminates the requirement for prefix expansion of port ranges. With the aid of a small embedded TCAM, packet classification can be implemented in a relatively small part of the available logic of an FPGA. The design is prototyped and evaluated in a Xilinx FPGA XCV2000E on the FPX platform. Even with the most difficult set of rules and packet inputs, the circuit is fast enough to sustain OC48 traffic throughput. Using larger and faster FPGAs, the system can work at speeds greater than OC192.
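The core idea of the Bit Vector approach — each header field lookup yields a bitmask of the rules it matches, and a bitwise AND across fields reports all matching rules at once — can be sketched in software. The rule set and field values below are invented for illustration; the paper's hardware performs the per-field lookups with a tree bitmap and a TCAM rather than a loop.

```python
# Hypothetical sketch of the Bit Vector (BV) classification idea:
# each field lookup produces a bitmask of matching rules, and one
# bitwise AND reports ALL matching rules simultaneously.

RULES = [
    {"src_port": range(0, 1024), "proto": 6},     # rule 0: low src ports, TCP
    {"src_port": range(80, 81),  "proto": 6},     # rule 1: src port 80, TCP
    {"src_port": range(0, 65536), "proto": 17},   # rule 2: any src port, UDP
]

def field_bitmask(field, value):
    """Return a bitmask with bit i set if rule i matches this field value."""
    mask = 0
    for i, rule in enumerate(RULES):
        if field == "src_port" and value in rule["src_port"]:
            mask |= 1 << i
        elif field == "proto" and value == rule["proto"]:
            mask |= 1 << i
    return mask

def classify(src_port, proto):
    """AND the per-field bitmasks; every set bit is a matching rule."""
    mask = field_bitmask("src_port", src_port) & field_bitmask("proto", proto)
    return [i for i in range(len(RULES)) if mask & (1 << i)]

print(classify(80, 6))  # a TCP packet from source port 80 matches rules 0 and 1
```

Reporting multiple matches falls out for free here, which is exactly the property the NIDS use case needs, since several Snort-style rules can apply to one packet.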

234 citations


Journal ArticleDOI
TL;DR: A detailed and flexible power model which has been integrated in the widely used Versatile Place and Route (VPR) CAD tool is described, which estimates the dynamic, short-circuit, and leakage power consumed by FPGAs.
Abstract: Power has become a critical issue for field-programmable gate array (FPGA) vendors. Understanding the power dissipation within FPGAs is the first step in developing power-efficient architectures and computer-aided design (CAD) tools for FPGAs. This article describes a detailed and flexible power model which has been integrated in the widely used Versatile Place and Route (VPR) CAD tool. This power model estimates the dynamic, short-circuit, and leakage power consumed by FPGAs. It is the first flexible power model developed to evaluate architectural tradeoffs and the efficiency of power-aware CAD tools for a variety of FPGA architectures, and is freely available for noncommercial use. The model is flexible, in that it can estimate the power for a wide variety of FPGA architectures, and it is fast, in that it does not require extensive simulation, meaning it can be used to explore a large architectural space. We show how the model can be used to investigate the impact of various architectural parameters on the energy consumed by the FPGA, focusing on the segment length, switch block topology, lookup-table size, and cluster size.
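The dynamic component such a model estimates follows the standard switching-power relation, P = ½·α·C·V²·f, summed over capacitive nodes. A toy version (not the actual VPR model, and with invented node capacitances and activities) might look like:

```python
# Toy dynamic-power estimate in the spirit of an FPGA power model
# (NOT the actual VPR model): P_dyn = 0.5 * activity * C * Vdd^2 * f,
# summed over every capacitive node in the mapped circuit.

def dynamic_power(nodes, vdd, freq):
    """nodes: list of (switching_activity, capacitance_farads) pairs."""
    return sum(0.5 * a * c * vdd**2 * freq for a, c in nodes)

# Three example nodes: a busy routing wire, a LUT output, a clock buffer.
nodes = [(0.2, 150e-15), (0.1, 30e-15), (1.0, 60e-15)]
p = dynamic_power(nodes, vdd=1.5, freq=200e6)
print(f"{p * 1e6:.1f} uW")
```

Architectural parameters such as segment length and cluster size enter a real model through the per-node capacitances, which is why the model can rank architectures without full circuit simulation.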

187 citations


Book
14 Dec 2005
TL;DR: A one-of-a-kind survey of the field of Reconfigurable Computing gives a comprehensive introduction to a discipline that offers a 10X-100X acceleration of algorithms over microprocessors.
Abstract: A one-of-a-kind survey of the field of Reconfigurable Computing. Gives a comprehensive introduction to a discipline that offers a 10X-100X acceleration of algorithms over microprocessors. Discusses the impact of reconfigurable hardware on a wide range of applications: signal and image processing, network security, bioinformatics, and supercomputing. Includes the history of the field as well as recent advances, and an extensive bibliography of primary sources.

178 citations


Book
01 Jan 2005
TL;DR: C-based techniques for building high-performance, FPGA-accelerated software applications are introduced, covering FPGA performance optimization, design flexibility, and time to market.
Abstract: C-based techniques for building high-performance, FPGA-accelerated software applications, optimizing FPGA performance, design flexibility, and time to market. Foreword written by Clive "Max" Maxfield. High-performance FPGA-accelerated software applications are a growing demand in fields ranging from communications and image processing to biomedical and scientific computing. This book introduces powerful, C-based parallel-programming techniques for creating these applications, verifying them, and moving them into FPGA hardware. The authors bridge the chasm between "conventional" software development and the methods and philosophies of FPGA-based digital design. Software engineers will learn to look at FPGAs as "just another programmable computing resource," while achieving phenomenal performance because much of their code is running directly in hardware. Hardware engineers will master techniques that perfectly complement their existing HDL expertise, while allowing them to explore design alternatives and create prototypes far more rapidly. Both groups will learn how to leverage C to support efficient hardware/software co-design and improve compilation, debugging, and testing. Readers will learn to:
- Understand when C makes sense in FPGA development and where it fits into existing processes
- Leverage C to implement software applications directly onto mixed hardware/software platforms
- Execute and test the same C algorithms in desktop PC environments and in-system using embedded processors
- Master new, C-based programming models and techniques optimized for highly parallel FPGA platforms
- Supercharge performance by optimizing through automated compilation
- Use multiple-process streaming programming models to deliver truly astonishing performance
- Preview the future of FPGA computing
- Study an extensive set of realistic C code examples
About the website: visit http://www.ImpulseC.com/practical to download fully operational, time-limited versions of a C-based FPGA design compiler, as well as additional examples and programming tips. © Copyright Pearson Education. All rights reserved.

176 citations


Patent
18 Jan 2005
TL;DR: In this article, the authors present a system and method for online configuration of a measurement system, where the user can access a server over a network and specify a desired task, and receive programs and/or configuration information which are usable to configure the user's measurement system hardware (and/or software) to perform the desired task.
Abstract: A system and method for online configuration of a measurement system. The user may access a server over a network and specify a desired task, e.g., a measurement task, and receive programs and/or configuration information which are usable to configure the user's measurement system hardware (and/or software) to perform the desired task. Additionally, if the user does not have the hardware required to perform the task, the required hardware may be sent to the user, along with programs and/or configuration information. The hardware may be reconfigurable hardware, such as an FPGA or a processor/memory based device. In one embodiment, the required hardware may be pre-configured to perform the task before being sent to the user. In another embodiment, the system and method may provide a graphical program in response to receiving the user's task specification, where the graphical program may be usable by the measurement system to perform the task.

145 citations


Journal ArticleDOI
03 Jun 2005
TL;DR: The design and realisation of a high level framework for the implementation of 1-D and 2-D FFTs for real-time applications and an FPGA-based parametrisable environment based on 2-D FFT is presented as a solution for frequency-domain image filtering application.
Abstract: Applications based on the fast Fourier transform (FFT), such as signal and image processing, require high computational power, plus the ability to experiment with algorithms. Reconfigurable hardware devices in the form of field programmable gate arrays (FPGAs) have been proposed as a way of obtaining high performance at an economical price. However, users must program FPGAs at a very low level and have a detailed knowledge of the architecture of the device being used. They do not therefore facilitate easy development of, or experimentation with, signal/image processing algorithms. To try to reconcile the dual requirements of high performance and ease of development, this paper reports on the design and realisation of a high level framework for the implementation of 1-D and 2-D FFTs for real-time applications. A wide range of FFT algorithms, including radix-2, radix-4, split-radix and fast Hartley transform (FHT) have been implemented under a common framework in order to enable the system designers to meet different system requirements. Results show that the parallel implementation of 2-D FFT achieves linear speed-up and real-time performance for large matrix sizes. Finally, an FPGA-based parametrisable environment based on 2-D FFT is presented as a solution for frequency-domain image filtering application.
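Radix-2 decimation-in-time is the simplest of the algorithm families the framework covers. A minimal software reference (the hardware versions pipeline the same butterfly structure) might look like:

```python
# Minimal recursive radix-2 decimation-in-time FFT, a software reference
# for one of the algorithm families covered. Input length must be a
# power of two.
import cmath

def fft(x):
    n = len(x)
    if n == 1:
        return list(x)
    even = fft(x[0::2])           # FFT of even-indexed samples
    odd = fft(x[1::2])            # FFT of odd-indexed samples
    out = [0] * n
    for k in range(n // 2):
        tw = cmath.exp(-2j * cmath.pi * k / n) * odd[k]  # twiddle factor
        out[k] = even[k] + tw                            # butterfly
        out[k + n // 2] = even[k] - tw
    return out

# An impulse transforms to a flat spectrum of unit magnitude.
print([abs(v) for v in fft([1, 0, 0, 0])])  # [1.0, 1.0, 1.0, 1.0]
```

A 2-D FFT then reduces to row FFTs followed by column FFTs, which is the structure that parallelises so well on an FPGA and yields the linear speedup the paper reports.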

138 citations


Proceedings ArticleDOI
20 Jun 2005
TL;DR: This paper looks at the advantages and disadvantages of FPGA technology, its suitability for image processing and computer vision tasks, and attempts to suggest some directions for the future.
Abstract: Reconfigurable hardware, in the form of Field Programmable Gate Arrays (FPGAs), is becoming increasingly attractive for digital signal processing problems, including image processing and computer vision tasks. The ability to exploit the parallelism often found in these problems, as well as the ability to support different modes of operation on a single hardware substrate, gives these devices a particular advantage over fixed architecture devices such as serial CPUs and DSPs. Further, development times are substantially shorter than dedicated hardware in the form of Application Specific ICs (ASICs), and small changes to a design can be prototyped in a matter of hours. On the other hand, designing with FPGAs still requires expertise beyond that found in many vision labs today. This paper looks at the advantages and disadvantages of FPGA technology, its suitability for image processing and computer vision tasks, and attempts to suggest some directions for the future.

131 citations


Journal ArticleDOI
TL;DR: The DFG merging process identifies similarities among the DFGs, and produces a single datapath that can be dynamically reconfigured and has a minimum area cost, when considering both hardware blocks and interconnections.
Abstract: Reconfigurable systems have been shown to achieve significant performance speedup through architectures that map the most time-consuming application kernel modules or inner loops to a reconfigurable datapath. As each portion of the application starts to execute, the system partially reconfigures the datapath so as to perform the corresponding computation. The reconfigurable datapath should have as few and simple hardware blocks and interconnections as possible, in order to reduce its cost, area, and reconfiguration overhead. To achieve that, hardware blocks and interconnections should be reused as much as possible across the application. We represent each piece of the application as a data-flow graph (DFG). The DFG merging process identifies similarities among the DFGs, and produces a single datapath that can be dynamically reconfigured and has a minimum area cost, when considering both hardware blocks and interconnections. In this paper we present a novel technique for the DFG merge problem, and we evaluate it using programs from the MediaBench benchmark. Our algorithm execution time approaches the fastest previous solution to this problem and produces datapaths with an average area reduction of 20%. When compared to the best known area solution, our approach produces datapaths with area costs equivalent to (and in many cases better than) it, while achieving impressive speedups.
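The resource-sharing payoff behind merging can be illustrated with a much simpler scheme than the paper's algorithm: if one hardware block of each operation type is time-shared across DFGs, the merged datapath needs the maximum count of each block type over all DFGs rather than the sum. The kernels below are invented, and interconnect sharing (the harder part of the actual DFG merge problem) is ignored.

```python
# Illustrative block-count estimate (NOT the paper's merge algorithm):
# a merged datapath reuses hardware blocks across DFGs, so it needs
# max(count per DFG) of each block type instead of the sum.
from collections import Counter

def merged_block_count(dfgs):
    """dfgs: list of op-type lists, one list per application kernel."""
    need = Counter()
    for dfg in dfgs:
        for op, n in Counter(dfg).items():
            need[op] = max(need[op], n)  # blocks are time-shared across DFGs
    return sum(need.values())

kernel_a = ["add", "add", "mul", "shift"]
kernel_b = ["add", "mul", "mul"]
separate = len(kernel_a) + len(kernel_b)           # 7 blocks without merging
merged = merged_block_count([kernel_a, kernel_b])  # 2 add + 2 mul + 1 shift
print(separate, merged)
```

The paper's contribution is doing this while also minimising the multiplexers and wires needed to reroute data between the shared blocks, which this sketch deliberately leaves out.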

122 citations


Proceedings ArticleDOI
07 Mar 2005
TL;DR: It is shown that specific reconfigurable hardware support improves the performance of the heuristic and that task migration mechanisms need to be tailored to on-chip networks.
Abstract: Run-time management of both communication and computation resources in a heterogeneous Network-on-Chip (NoC) is a challenging task. First, platform resources need to be assigned in a fast and efficient way. Secondly, the resources might need to be reallocated when platform conditions or user requirements change. We developed a run-time resource management scheme that is able to efficiently manage a NoC containing fine grain reconfigurable hardware tiles. This paper details our task assignment heuristic and two run-time task migration mechanisms that deal with the message consistency problem in a NoC. We show that specific reconfigurable hardware tile support improves performance of the heuristic and that task migration mechanisms need to be tailored to on-chip networks.

Journal ArticleDOI
TL;DR: This work presents a new approach to compute multiple sequence alignments in far shorter time using reconfigurable hardware, which results in an implementation of ClustalW with significant runtime savings on a standard off-the-shelf FPGA.
Abstract: Summary: Aligning hundreds of sequences using progressive alignment tools such as ClustalW requires several hours on state-of-the-art workstations. We present a new approach to compute multiple sequence alignments in far shorter time using reconfigurable hardware. This results in an implementation of ClustalW with significant runtime savings on a standard off-the-shelf FPGA. Availability: An online server for ClustalW running on a Pentium IV 3 GHz with a Xilinx XC2V6000 FPGA PCI-board is available at http://beta.projectproteus.org. The PE hardware design in Verilog HDL is available on request from the first author. Contact: tim.oliver@pmail.ntu.edu.sg

Proceedings ArticleDOI
07 Mar 2005
TL;DR: Warp processing is proposed, a technique capable of optimizing a software application by dynamically and transparently re-implementing critical software kernels as custom circuits in on-chip configurable logic, and it is demonstrated that the soft-core based warp processor achieves average speedups of 5.8 and energy reductions of 57% compared to the soft core alone.
Abstract: Field programmable gate arrays (FPGAs) provide designers with the ability to quickly create hardware circuits. Increases in FPGA configurable logic capacity and decreasing FPGA costs have enabled designers to more readily incorporate FPGAs in their designs. FPGA vendors have begun providing configurable soft processor cores that can be synthesized onto their FPGA products. While FPGAs with soft processor cores provide designers with increased flexibility, such processors typically have degraded performance and energy consumption compared to hard-core processors. Previously, we proposed warp processing, a technique capable of optimizing a software application by dynamically and transparently re-implementing critical software kernels as custom circuits in on-chip configurable logic. In this paper, we study the potential of a MicroBlaze soft-core based warp processing system to eliminate the performance and energy overhead of a soft-core processor compared to a hard-core processor. We demonstrate that the soft-core based warp processor achieves average speedups of 5.8 and energy reductions of 57% compared to the soft core alone. Our data shows that a soft-core based warp processor yields performance and energy consumption competitive with existing hard-core processors, thus expanding the usefulness of soft processor cores on FPGAs to a broader range of applications.

Journal ArticleDOI
TL;DR: A platform for evolving spiking neural networks on FPGAs is presented as a combination of three parts: a hardware substrate, a computing engine, and an adaptation mechanism.

Proceedings ArticleDOI
10 Oct 2005
TL;DR: The purpose of this study is to use the coarse-grained architecture for H.264/AVC in order to determine at the physical level whether reconfigurable computing can achieve both high performance and low power.
Abstract: Portable wireless multimedia approaches traditionally achieve the specified performance and power consumption with a hardwired accelerator implementation. Due to the increase of algorithm complexity (Shannon's law), flexibility is needed to achieve shorter development cycles. A coarse-grained reconfigurable computing concept for these requirements is discussed, which supports both flexible control decisions and repetitive numerical operations. The concept includes an architecture template and a compiler and simulator environment. The architecture provides flexible time-multiplexing of code for high-performance data processing while keeping the configuration bandwidth and power requirements low. The purpose of this study is to apply the coarse-grained architecture to H.264/AVC in order to determine at the physical level whether reconfigurable computing can achieve both high performance and low power.

Journal ArticleDOI
TL;DR: The dynamic network on chip (DyNoC) is introduced as a viable communication infrastructure for communication on dynamically reconfigurable devices and algorithms and implementation results from real-life problems are provided.
Abstract: This article presents two approaches to solving the problem of communication between components dynamically placed at runtime on a reconfigurable device. The first is a circuit-routing approach designed for existing FPGAs. This approach uses the reconfigurable multiple bus (RMB). The second, network-based approach targets devices with unlimited reconfiguration capability such as coarse-grained reconfigurable devices. We introduce the dynamic network on chip (DyNoC) as a viable communication infrastructure for communication on dynamically reconfigurable devices. For prototyping the DyNoC on FPGAs, we design and implement an unrestricted communication model for a columnwise-reconfigurable chip. For the DyNoC, as well as for the RMB on chip (RMBoC), we provide algorithms and implementation results from real-life problems.

Journal ArticleDOI
TL;DR: A hardware Gaussian noise generator based on the Wallace method used for a hardware simulation system that accurately models a true Gaussian probability density function even at high σ values is described.
Abstract: We describe a hardware Gaussian noise generator based on the Wallace method used for a hardware simulation system. Our noise generator accurately models a true Gaussian probability density function even at high σ values. We evaluate its properties using: 1) several different statistical tests, including the chi-square test and the Anderson-Darling test and 2) an application for decoding of low-density parity-check (LDPC) codes. Our design is implemented on a Xilinx Virtex-II XC2V4000-6 field-programmable gate array (FPGA) at 155 MHz; it takes up 3% of the device and produces 155 million samples per second, which is three times faster than a 2.6-GHz Pentium-IV PC. Another implementation on a Xilinx Spartan-III XC3S200E-5 FPGA at 106 MHz is two times faster than the software version. Further improvement in performance can be obtained by concurrent execution: 20 parallel instances of the noise generator on an XC2V4000-6 FPGA at 115 MHz can run 51 times faster than software on a 2.6-GHz Pentium-IV PC.
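What makes the Wallace method attractive for hardware is that it turns a pool of existing Gaussian values into new Gaussian values using only additions and a scale, with no log, square root, or trigonometric function per sample. The following is a simplified software analogue of one transformation pass (the group size, transform matrix, and pool size are illustrative choices, not the paper's hardware parameters, and the pool-renormalisation step of the full method is omitted):

```python
# Sketch of one Wallace-method transformation pass: an orthogonal 4x4
# transform maps four pooled Gaussian values to four new Gaussian values.
# Orthogonality preserves the pool's energy (sum of squares), which is
# what keeps the output distribution Gaussian across passes.
import random

H = [[ 0.5,  0.5,  0.5,  0.5],
     [ 0.5, -0.5,  0.5, -0.5],
     [ 0.5,  0.5, -0.5, -0.5],
     [ 0.5, -0.5, -0.5,  0.5]]   # orthogonal (scaled Hadamard) matrix

def wallace_pass(pool):
    """Shuffle the pool, then transform it four values at a time."""
    random.shuffle(pool)
    out = []
    for i in range(0, len(pool), 4):
        v = pool[i:i + 4]
        out.extend(sum(H[r][c] * v[c] for c in range(4)) for r in range(4))
    return out

pool = [random.gauss(0, 1) for _ in range(256)]
new_pool = wallace_pass(pool)
print(abs(sum(x * x for x in pool) - sum(x * x for x in new_pool)) < 1e-9)  # True
```

In the hardware version the shuffle becomes pseudo-random addressing of a pool RAM, which is why the generator sustains one sample per clock.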

Proceedings ArticleDOI
18 Apr 2005
TL;DR: Reconfigurable computing as it could be used in mainstream systems is examined, focusing on a proposed scheduling algorithm to allocate the reconfigurable hardware.
Abstract: Although many studies have demonstrated the benefits of reconfigurable computing, it has not yet penetrated the mainstream. One of the biggest unsolved problems is the management of the reconfigurable hardware in a multi-threaded environment. Most research in reconfigurable computing has assumed a single-threaded model, but this is unrealistic for personal computing and many types of embedded computing. In these cases, there may be several different threads or processes running simultaneously, each wishing to use the reconfigurable hardware. The operating system must decide how to allocate the hardware at run-time based on the status of the system. The system status could also influence the choice of different implementations for each circuit based on area/speed tradeoffs. This paper examines reconfigurable computing as it could be used in mainstream systems, focusing on a proposed scheduling algorithm to allocate the reconfigurable hardware. Our initial tests indicate that reconfigurable computing with our scheduler can easily achieve at least a 20% system-level speedup.
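One plausible shape for such an allocator (a hypothetical sketch, not the paper's scheduler) is a greedy pass over the area/speed implementation options each thread offers, granting fabric by speedup gained per unit of area. All thread names and numbers below are invented:

```python
# Hypothetical run-time allocator sketch: each thread offers circuit
# implementations with different area/speedup tradeoffs, and the OS
# greedily grants reconfigurable fabric by speedup per unit area.

def allocate(requests, total_area):
    """requests: {thread: [(area, speedup), ...]}; grant at most one
    implementation per thread, within the total fabric area."""
    options = [(s / a, a, s, t) for t, impls in requests.items() for a, s in impls]
    options.sort(reverse=True)  # best speedup-per-area first
    granted, used = {}, 0
    for _, area, speedup, thread in options:
        if thread not in granted and used + area <= total_area:
            granted[thread] = (area, speedup)
            used += area
    return granted

reqs = {"video": [(30, 8.0), (20, 5.0)],   # two implementations to choose from
        "crypto": [(25, 6.0)],
        "net": [(50, 4.0)]}
print(allocate(reqs, total_area=60))
```

A real scheduler would also re-run this as threads start and exit, which is where the run-time status of the system influences which implementation of each circuit gets loaded.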

Proceedings ArticleDOI
04 Apr 2005
TL;DR: An FPGA-based hardware design to accelerate the BLAST algorithm, a standard computer application that molecular biologists use to search for sequence similarity in genomic databases.
Abstract: Basic Local Alignment Search Tool (BLAST) is a standard computer application that molecular biologists use to search for sequence similarity in genomic databases. This paper describes an FPGA-based hardware design to accelerate the BLAST algorithm. FPGA-based custom computing machines, more widely known as reconfigurable computing, are supported by a number of vendors and the basic cost of FPGA hardware is dramatically decreasing. Hence, the main objective of this project is to explore the feasibility of using this new technology to realize a portable, open source FPGA-based accelerator for the BLAST algorithm. The present design is targeted to an AceIIcard and the design is based on the latest version of BLAST available from NCBI. Since the entire application does not fit in hardware, a profile study was conducted that identifies the computationally intensive part of BLAST. An FPGA hardware component has been designed and implemented for this critical segment. The portability and cost-effectiveness of the design are discussed.

Journal ArticleDOI
TL;DR: This work introduces a multithreaded programming model for reconfigurable computing based on a unified virtual-memory image for both software and hardware application parts and addresses the challenge of achieving seamless hardware-software interfacing and portability with minimal performance penalties.
Abstract: Ideally, reconfigurable-system programmers and designers should code algorithms and write hardware accelerators independently of the underlying platform. To realize this scenario, the authors propose a portable, hardware-agnostic programming paradigm, which delegates platform-specific tasks to a system-level virtualization layer. This layer supports a chosen programming model and hides platform details from users much as general-purpose computers do. We introduce a multithreaded programming model for reconfigurable computing based on a unified virtual-memory image for both software and hardware application parts. We also address the challenge of achieving seamless hardware-software interfacing and portability with minimal performance penalties.

Proceedings ArticleDOI
01 May 2005
TL;DR: Experimental results show that, using a highly reliable, low-cost mitigation technique, the availability of an FPGA-mapped design can be increased to more than 99%.
Abstract: FPGA-based designs are more susceptible to single-event upsets (SEUs) compared to ASIC designs, since SEUs in configuration bits of FPGAs result in permanent errors in the mapped design. Moreover, the number of sensitive configuration bits is two orders of magnitude more than user bits in typical FPGA-based circuits. In this paper, we present a highly reliable, low-cost mitigation technique which can significantly improve the availability of designs mapped into FPGAs. Experimental results show that, using this technique, the availability of an FPGA-mapped design can be increased to more than 99%.

Proceedings ArticleDOI
10 Oct 2005
TL;DR: A hardware perfect-hashing technique is introduced to access the memory that contains the matching patterns to detect hazardous contents using pattern matching and achieves at least 30% better efficiency compared to previous work, measured in throughput per area required per matching character.
Abstract: In this paper, we consider scanning and analyzing packets in order to detect hazardous contents using pattern matching. We introduce a hardware perfect-hashing technique to access the memory that contains the matching patterns. A subsequent simple comparison between incoming data and memory output determines the match. We implement our scheme in reconfigurable hardware and show that we can achieve a throughput between 1.7 and 5.7 Gbps requiring only a few tens of FPGA memory blocks and 0.30 to 0.57 logic cells per matching character. We also show that our designs achieve at least 30% better efficiency compared to previous work, measured in throughput per area required per matching character.
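The core mechanism — a hash selects exactly one candidate pattern from memory, and a single comparison against the incoming bytes confirms or rejects the match — can be sketched as below. Python's built-in hash over a table grown until it is collision-free stands in for the generated perfect hash function; the patterns are invented examples, not from the paper's rule set.

```python
# Sketch of the perfect-hash match idea: one hash, one memory read,
# one comparison — no probing, no per-pattern scan. The hash function
# here is a stand-in for the paper's generated perfect hash.

PATTERNS = [b"/etc/passwd", b"cmd.exe", b"<script>"]

def build_table(patterns):
    """Find a table size where every pattern lands in its own slot."""
    size = len(patterns)
    while True:
        table = [None] * size
        for p in patterns:
            slot = hash(p) % size
            if table[slot] is not None:
                break                # collision: try a bigger table
            table[slot] = p
        else:
            return table             # collision-free: perfect for this set
        size += 1

TABLE = build_table(PATTERNS)

def match(data):
    """Hash selects ONE candidate; one comparison decides the match."""
    candidate = TABLE[hash(data) % len(TABLE)]
    return candidate == data

print(match(b"cmd.exe"), match(b"cmd.exf"))  # True False
```

In the hardware design the comparison runs in parallel with the next lookup, which is how the pipeline sustains multi-gigabit throughput with only a few memory blocks.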

Proceedings ArticleDOI
18 Apr 2005
TL;DR: Three implementation alternatives for the fast Fourier transform (FFT) on FPGA are explored and the results indicate that FPGAs are competitive with microprocessors in terms of performance and that the "correct" FFT implementation varies based on the size of the transform and the sizes of the FGPAs.
Abstract: Advances in FPGA technology have led to dramatic improvements in double precision floating-point performance. Modern FPGAs boast several GigaFLOPs of raw computing power. Unfortunately, this computing power is distributed across 30 floating-point units with over 10 cycles of latency each. The user must find two orders of magnitude more parallelism than is typically exploited in a single microprocessor; thus, it is not clear that the computational power of FPGAs can be exploited across a wide range of algorithms. This paper explores three implementation alternatives for the fast Fourier transform (FFT) on FPGAs. The algorithms are compared in terms of sustained performance and memory requirements for various FFT sizes and FPGA sizes. The results indicate that FPGAs are competitive with microprocessors in terms of performance and that the "correct" FFT implementation varies based on the size of the transform and the size of the FPGA.

Proceedings ArticleDOI
07 Mar 2005
TL;DR: A hybrid design/run-time prefetch heuristic is developed that schedules the reconfigurations at run-time, but carries out the scheduling computations at design-time by carefully identifying a set of near-optimal schedules that can be selected atRun-time.
Abstract: Due to the emergence of highly dynamic multimedia applications there is a need for flexible platforms and run-time scheduling support for embedded systems. Dynamic Reconfigurable Hardware (DRHW) is a promising candidate to provide this flexibility but, currently, not sufficient run-time scheduling support to deal with the run-time reconfigurations exists. Moreover, executing at run-time a complex scheduling heuristic to provide this support may generate an excessive run-time penalty. Hence, we have developed a hybrid design/run-time prefetch heuristic that schedules the reconfigurations at run-time, but carries out the scheduling computations at design-time by carefully identifying a set of near-optimal schedules that can be selected at run-time. This approach provides run-time flexibility with a negligible penalty.
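The hybrid split can be illustrated in a few lines: the expensive search over reconfiguration orders happens at design time, leaving the run-time layer a simple table lookup keyed by the predicted execution path. The paths and configuration names below are invented for illustration:

```python
# Sketch of the hybrid design/run-time prefetch idea: near-optimal
# reconfiguration schedules are precomputed at design time for each
# likely execution path, so run-time "scheduling" is just a lookup.

# Design time: one precomputed schedule per anticipated execution path.
SCHEDULES = {
    ("decode", "filter"): ["load_decoder", "load_filter"],
    ("decode", "scale"):  ["load_decoder", "load_scaler"],
    ("filter",):          ["load_filter"],
}

def runtime_prefetch(predicted_path):
    """Run time: select the precomputed schedule; fall back to no prefetch."""
    return SCHEDULES.get(tuple(predicted_path), [])

print(runtime_prefetch(["decode", "scale"]))  # ['load_decoder', 'load_scaler']
```

The negligible run-time penalty the paper claims follows directly from this structure: the cost of choosing a schedule is a dictionary lookup, not a scheduling computation.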

Journal ArticleDOI
01 Mar 2005
TL;DR: This paper presents an FPGA implementation of the parallel-beam backprojection algorithm used in CT for which all the requirements are met and shows approximately 100 times speedup over software versions of the same algorithm running on a 1 GHz Pentium, and is more flexible than an ASIC implementation.
Abstract: Medical image processing in general and computerized tomography (CT) in particular can benefit greatly from hardware acceleration. This application domain is marked by computationally intensive algorithms requiring the rapid processing of large amounts of data. To date, reconfigurable hardware has not been applied to the important area of image reconstruction. For efficient implementation and maximum speedup, fixed-point implementations are required. The associated quantization errors must be carefully balanced against the requirements of the medical community. Specifically, care must be taken so that very little error is introduced compared to floating-point implementations and the visual quality of the images is not compromised. In this paper, we present an FPGA implementation of the parallel-beam backprojection algorithm used in CT for which all of these requirements are met. We explore a number of quantization issues arising in backprojection and concentrate on minimizing error while maximizing efficiency. Our implementation shows approximately 100 times speedup over software versions of the same algorithm running on a 1 GHz Pentium, and is more flexible than an ASIC implementation. Our FPGA implementation can easily be adapted to both medical sensors with different dynamic ranges as well as tomographic scanners employed in a wider range of application areas including nondestructive evaluation and baggage inspection in airport terminals.
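The quantization tradeoff at the heart of this design can be shown in miniature: rounding projection data to n fractional bits introduces an error of at most half an LSB, so each extra bit halves the worst-case error while costing hardware width. The sample value below is arbitrary:

```python
# Toy illustration of the fixed-point tradeoff studied for backprojection:
# quantizing to n fractional bits bounds the error by half an LSB,
# trading image accuracy against datapath width in hardware.

def quantize(x, frac_bits):
    """Round x to a fixed-point value with frac_bits fractional bits."""
    scale = 1 << frac_bits
    return round(x * scale) / scale

value = 0.7071067811865476           # e.g. one projection-data sample
for bits in (4, 8, 12):
    q = quantize(value, bits)
    half_lsb = 0.5 / (1 << bits)
    print(bits, q, abs(q - value) <= half_lsb)
```

Balancing this bound against the visual-quality requirements of the medical community is exactly the error analysis the paper carries out across the whole backprojection datapath.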

Journal ArticleDOI
TL;DR: This article describes an approach to dynamic reconfiguration that reduces reconfiguration latency to the point where dynamic multimedia applications can now exploit such platforms.
Abstract: Dynamic reconfiguration has been a technology solution in search of the right problem to solve. Effective use of the technology requires new programming and task management models. This article describes an approach to dynamic reconfiguration that reduces reconfiguration latency to the point where dynamic multimedia applications can now exploit such platforms.

Proceedings ArticleDOI
13 Oct 2005
TL;DR: In this paper, the development of FPGA-based ANNs is presented, and ANN implementation in hardware, mainly FPGA, is presented and discussed.
Abstract: In this paper, the development of FPGA-based ANNs is presented. Field-programmable gate array (FPGA) based artificial neural networks (ANNs) are now becoming a focus of ANN research. Given the parallelism inherent in neural networks (NNs), hardware implementation is superior to the software approach because it can exploit these characteristics. Furthermore, since the FPGA is a digital device with reprogrammable properties and robust flexibility, many researchers have made great efforts toward realizing NNs using FPGA techniques. However, they have encountered many problems in the process. Results of this research are presented and discussed. The future development of FPGA implementations of ANNs is also considered. First, ANN and FPGA techniques are briefly introduced; then ANN implementation in hardware, mainly FPGA, is presented and discussed.

Proceedings ArticleDOI
10 Oct 2005
TL;DR: An FPGA-based pre-filter is presented that reduces the amount of traffic sent to a software-based NIDS for inspection and can reduce up to 90% of network traffic that would have otherwise been processed by Snort software.
Abstract: Software-based network intrusion detection systems (NIDS) often fail to keep up with high-speed network links. In this paper an FPGA-based pre-filter is presented that reduces the amount of traffic sent to a software-based NIDS for inspection. Simulations using real network traces and the Snort rule set show that a pre-filter can reduce up to 90% of network traffic that would have otherwise been processed by Snort software. The projected performance enables a computer to perform real-time intrusion detection of malicious content passing over a 10 Gbps network using FPGA hardware that operates with 10 Gbps of throughput and software that needs only to operate with 1 Gbps of throughput.

Proceedings ArticleDOI
11 Dec 2005
TL;DR: This work presents a new concept as well as the implementation of an FPGA-based reconfigurable platform, the Erlangen slot machine (ESM), allowing an unrestricted relocation of modules on the device.
Abstract: We present a new concept as well as the implementation of an FPGA-based reconfigurable platform, the Erlangen slot machine (ESM). One main advantage of this platform is the possibility for each module to access its periphery independent of its location through a programmable crossbar, allowing an unrestricted relocation of modules on the device. Furthermore, we propose different intermodule communication structures.

Patent
23 Sep 2005
TL;DR: In this article, a method of developing peripherals for integration with a vehicle control system comprises providing a vehicle controller and interconnection system that includes a system core for processing data, an input module and an output module.
Abstract: A method of developing peripherals for integration with a vehicle control system comprises providing a vehicle control and interconnection system that includes a system core for processing data, an input module and an output module. The system core includes a reconfigurable space having reconfigurable hardware, memory and a supervising processor that is customized to the order. The supervising processor is configured to provide control information to identified peripherals and control the allocation and configuration of the reconfigurable space into a plurality of independent information processing workspaces. The associated information processing workspace for the peripherals is configured if required, a verifying operation of the peripherals with the control and interconnection system is performed and the peripherals are authorized as approved peripherals. To integrate the peripherals into the system, design tools assist the developer in configuring an associated information processing workspace, setting up operating conditions or performing other integration tasks.