Showing papers on "Field-programmable gate array" published in 2008


Open Access

Patent•

Method for Protecting Intellectual Property Cores on Field Programmable Gate Array


02 Jul 2008

TL;DR: In this paper, the authors present techniques to protect intellectual property cores on field programmable gate arrays (FPGAs) by associating each FPGA with a secret key, which is used to charge a customer per-use or per-configuration of their intellectual property.


Abstract: Techniques are used to protect intellectual property cores on field programmable gate arrays. An approach is to associate each field programmable gate array, or a limited number of field programmable gate arrays, with a secret key. Each field programmable gate array may only be properly configured or programmed by an appropriate encrypted bitstream (which includes one or more intellectual property cores). This encrypted bitstream has been encoded by or for the secret key associated with a particular FPGA. Other techniques are also presented in this application and include network-based, nonnetwork-based, software-based, layered, and other approaches. The techniques allow an intellectual property core vendor to charge a customer per-use or per-configuration of their intellectual property. This is because an encrypted bitstream is useable only in a limited number, possibly just one, of the integrated circuits.


218 citations

Monograph•DOI•

FPGA-based Implementation of Signal Processing Systems


Roger Woods¹, John McAllister¹, R. Turner, Ying Yi², Gaye Lightbody•Institutions (2)

Queen's University Belfast¹, University of Edinburgh²

10 Dec 2008

TL;DR: FPGA-based Implementation of Signal Processing Systems is an important reference for practising engineers and researchers working on the design and development of DSP systems for radio, telecommunication, information, audio-visual and security applications.


Abstract: Field programmable gate arrays (FPGAs) are an increasingly popular technology for implementing digital signal processing (DSP) systems. By allowing designers to create circuit architectures developed for specific applications, high levels of performance can be achieved for many DSP applications, providing considerable improvements over conventional microprocessor and dedicated DSP processor solutions. The book addresses the key issue in this process, specifically the methods and tools needed for the design, optimization and implementation of DSP systems in programmable FPGA hardware. It presents a review of the leading-edge techniques in this field, analyzing advanced DSP-based design flows for both signal flow graph- (SFG-) based and dataflow-based implementation, system on chip (SoC) aspects, and future trends and challenges for FPGAs. The automation of the techniques for component architectural synthesis, computational models, and the reduction of energy consumption to help improve FPGA performance are given in detail. Written from a system-level design perspective and with a DSP focus, the authors present many practical application examples of complex DSP implementation, involving: high-performance computing, e.g. matrix operations such as matrix multiplication; high-speed filtering, including finite impulse response (FIR) filters and wave digital filters (WDFs); adaptive filtering, e.g. recursive least squares (RLS) filtering; and transforms such as the fast Fourier transform (FFT). FPGA-based Implementation of Signal Processing Systems is an important reference for practising engineers and researchers working on the design and development of DSP systems for radio, telecommunication, information, audio-visual and security applications. Senior-level electrical and computer engineering graduates taking courses in signal processing or digital signal processing shall also find this volume of interest.


215 citations

Journal Article•DOI•

A Parallel Hardware Architecture for Scale and Rotation Invariant Feature Detection


Vanderlei Bonato¹, Eduardo Marques¹, George A. Constantinides²•Institutions (2)

University of São Paulo¹, Imperial College London²

01 Dec 2008-IEEE Transactions on Circuits and Systems for Video Technology

TL;DR: The achieved system performance is at least one order of magnitude better than a PC-based solution, a result achieved by investigating the impact of several hardware-orientated optimizations on performance, area and accuracy.


Abstract: This paper proposes a parallel hardware architecture for image feature detection based on the scale invariant feature transform algorithm and applied to the simultaneous localization and mapping problem. The work also proposes specific hardware optimizations considered fundamental to embed such a robotic control system on-a-chip. The proposed architecture is completely stand-alone; it reads the input data directly from a CMOS image sensor and provides the results via a field-programmable gate array coupled to an embedded processor. The results may either be used directly in an on-chip application or accessed through an Ethernet connection. The system is able to detect features at up to 30 frames per second (320×240 pixels) and has accuracy similar to a PC-based implementation. The achieved system performance is at least one order of magnitude better than a PC-based solution, a result achieved by investigating the impact of several hardware-orientated optimizations on performance, area and accuracy.


198 citations

Journal Article•DOI•

FPGA Realization of FIR Filters by Efficient and Flexible Systolization Using Distributed Arithmetic


Pramod Kumar Meher¹, S. Chandrasekaran², Abbes Amira²•Institutions (2)

Nanyang Technological University¹, University College West²

01 Jul 2008-IEEE Transactions on Signal Processing

TL;DR: The systolic decomposition scheme is found to offer a flexible choice of the address length of the lookup tables (LUT) for DA-based computation to decide on suitable area time tradeoff, and the choice of address length yields the best of area-delay-power-efficient realizations of the FIR filter for various filter orders.


Abstract: In this paper, we present the design optimization of one- and two-dimensional fully pipelined computing structures for area-delay-power-efficient implementation of finite-impulse-response (FIR) filters by systolic decomposition of distributed arithmetic (DA)-based inner-product computation. The systolic decomposition scheme is found to offer a flexible choice of the address length of the lookup tables (LUT) for DA-based computation to decide on a suitable area-time tradeoff. It is observed that by using smaller address lengths for DA-based computing units, it is possible to reduce the memory size, but on the other hand this increases the adder complexity and the latency. For efficient DA-based realization of FIR filters of different orders, the flexible linear systolic design is implemented on a Xilinx Virtex-E XCV2000E FPGA using a hybrid combination of Handel-C and parameterizable VHDL cores. Various key performance metrics such as number of slices, maximum usable frequency, dynamic power consumption, energy density, and energy throughput are estimated for different filter orders and address lengths. Analysis of the results obtained indicates that the performance metrics of the proposed implementation are broadly in line with theoretical expectations. It is found that the choice of address length yields the best of area-delay-power-efficient realizations of the FIR filter for various filter orders. Moreover, the proposed FPGA implementation is found to involve significantly less area-delay complexity compared with the existing DA-based implementations of FIR filters.


194 citations
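
The address-length tradeoff described in this abstract can be illustrated with a minimal Python sketch of a distributed-arithmetic inner product. This is not the authors' systolic design: the function names (`build_lut`, `da_inner_product`, `lut_words`) are hypothetical, inputs are assumed to be unsigned integers to avoid sign handling, and the bit-serial loop stands in for what hardware would do with shift-accumulators.

```python
from typing import List


def build_lut(coeffs: List[int]) -> List[int]:
    """Precompute all partial sums of coeffs; address bit k selects coeffs[k]."""
    return [
        sum(c for k, c in enumerate(coeffs) if (addr >> k) & 1)
        for addr in range(1 << len(coeffs))
    ]


def da_inner_product(coeffs: List[int], samples: List[int],
                     bits: int = 8, addr_len: int = 4) -> int:
    """Compute sum(c * x) for unsigned samples using LUT lookups and shifts only."""
    assert len(coeffs) == len(samples)
    assert all(0 <= x < (1 << bits) for x in samples)

    # Split the taps into groups of addr_len; each group gets its own LUT.
    starts = range(0, len(coeffs), addr_len)
    luts = [build_lut(coeffs[i:i + addr_len]) for i in starts]
    groups = [samples[i:i + addr_len] for i in starts]

    acc = 0
    for b in range(bits):  # process one bit plane of the inputs at a time
        partial = 0
        for lut, xs in zip(luts, groups):
            addr = 0
            for k, x in enumerate(xs):
                addr |= ((x >> b) & 1) << k  # bit b of each sample forms the address
            partial += lut[addr]             # one extra adder per additional group
        acc += partial << b                  # weight the bit plane by 2**b
    return acc


def lut_words(num_taps: int, addr_len: int) -> int:
    """Total LUT entries needed when the taps are split into groups of addr_len."""
    full, rem = divmod(num_taps, addr_len)
    return full * (1 << addr_len) + ((1 << rem) if rem else 0)
```

Under these assumptions, `lut_words(16, 16)` is 65536 entries in a single table, while `lut_words(16, 4)` is 64 entries spread over four tables whose outputs need three additional additions per bit plane. That is the memory-versus-adder tradeoff the abstract refers to.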

Proceedings Article•DOI•

From the bitstream to the netlist


Éric Rannaud¹•Institutions (1)

École Normale Supérieure¹

24 Feb 2008

TL;DR: This work aims to raise awareness about security issues for users of FPGAs and makes custom compilation and low-level tinkering with bitstreams - à la JBits - possible.


Abstract: This poster presents an in-depth analysis of the Xilinx bitstream file format. This theoretical analysis is backed by a simple and efficient implementation of a reverse-engineering tool for Xilinx bitstreams. The development process followed these lines. First, publicly available documentation from Xilinx has been analyzed; then some custom assumptions about the bitstream format have been made. This information allowed a suitable algorithm to be run on well-chosen bitstreams. The output from this automated analysis step is a database which relates raw bitstream data to low-level netlist elements. This database is subsequently used as input to an efficient bitstream compiler which can either generate a bitstream from a low-level (XDL) description of the netlist, or conversely decompile any given bitstream to its low-level netlist elements. This work has been validated for the spartan3, virtex2, virtex4 and virtex5 FPGA lines from Xilinx. Decompiling a bitstream is very fast; it is two orders of magnitude faster than the reverse operation of compilation with Xilinx's bitgen. This work aims to raise awareness about security issues for users of FPGAs. It also makes custom compilation and low-level tinkering with bitstreams - à la JBits - possible.


182 citations

Journal Article•DOI•

Fast Elliptic Curve Cryptography on FPGA


W.N. Chelton¹, Mohammed Benaissa¹•Institutions (1)

University of Sheffield¹

01 Feb 2008-IEEE Transactions on Very Large Scale Integration Systems

TL;DR: This paper details the design of a new high-speed pipelined application-specific instruction set processor (ASIP) for elliptic curve cryptography (ECC) using field-programmable gate-array (FPGA) technology.


Abstract: This paper details the design of a new high-speed pipelined application-specific instruction set processor (ASIP) for elliptic curve cryptography (ECC) using field-programmable gate-array (FPGA) technology. Different levels of pipelining were applied to the data path to explore the resulting performances and find an optimal pipeline depth. Three complex instructions were used to reduce the latency by reducing the overall number of instructions, and a new combined algorithm was developed to perform point doubling and point addition using the application-specific instructions. An implementation for the United States Government National Institute of Standards and Technology-recommended curve over GF(2^163) is shown, which achieves a point multiplication time of 33.05 μs at 91 MHz on a Xilinx Virtex-E FPGA, the fastest figure reported in the literature to date. Using the more modern Xilinx Virtex-4 technology, a point multiplication time of 19.55 μs was achieved, which translates to over 51120 point multiplications per second.


167 citations
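
As a quick sanity check on the figures in this abstract, the short Python sketch below converts the two reported point multiplication times into operations per second and clock cycles. It assumes the times are in microseconds, which is the reading consistent with the quoted rate of over 51120 point multiplications per second. The function names are illustrative only.

```python
def ops_per_second(latency_us: float) -> float:
    """Operations per second for a unit that completes one operation per latency."""
    return 1e6 / latency_us


def cycles_per_op(latency_us: float, clock_mhz: float) -> float:
    """Clock cycles spent on one operation (microseconds times MHz gives cycles)."""
    return latency_us * clock_mhz


virtex_e_rate = ops_per_second(33.05)        # Virtex-E figure from the abstract
virtex_4_rate = ops_per_second(19.55)        # Virtex-4 figure from the abstract
virtex_e_cycles = cycles_per_op(33.05, 91)   # 91 MHz clock from the abstract

print(f"Virtex-E: {virtex_e_rate:.0f} point multiplications/s, "
      f"about {virtex_e_cycles:.0f} cycles each")
print(f"Virtex-4: {virtex_4_rate:.0f} point multiplications/s")
```

The Virtex-4 rate comes out slightly above 51120 per second, matching the abstract's claim. The abstract gives no Virtex-4 clock frequency, so no cycle count is derived for that device.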

Patent•

System and method for hardware-software multitasking on a reconfigurable computing platform


Vincent Nollet, Paul Coene, Jean-Yves Mignolet, Serge Vernalde, Diederik Verkest, Theodore Marescaux, Andrei Bartic

10 Dec 2008

TL;DR: In this paper, a platform supporting reconfigurable computing, enabling the introduction of reconfigurable hardware into portable devices, is described. It is a heterogeneous multi-processor platform containing one or more instruction set processors (ISP) and a reconfigurable matrix (for instance a gate array, especially an FPGA).


Abstract: A platform supporting reconfigurable computing, enabling the introduction of reconfigurable hardware into portable devices is described. Dynamic hardware/software multitasking methods for a reconfigurable computing platform including reconfigurable hardware devices such as gate arrays, especially FPGA's, and software, such as dedicated hardware/software operating systems and middleware, adapted for supporting the methods, especially multitasking, are described. A computing platform, which is a heterogeneous multi-processor platform, containing one or more instruction set processors (ISP) and a reconfigurable matrix (for instance a gate array, especially an FPGA), adapted for (dynamic) hardware/software multitasking is described.


147 citations

Journal Article•DOI•

The Reconfigurable Instruction Cell Array


S. Khawam, I. Nousias, M. Milward, Ying Yi¹, M. Muir, Tughrul Arslan•Institutions (1)

University of Edinburgh¹

01 Jan 2008-IEEE Transactions on Very Large Scale Integration Systems

TL;DR: Results show that the reconfigurable instruction cell array delivers considerably less power consumption when compared to leading VLIW and low-power digital signal processors, but still maintaining their throughput performance.


Abstract: This paper presents a novel instruction cell-based reconfigurable computing architecture for low-power applications, hereafter referred to as the reconfigurable instruction cell array (RICA). For the development of the RICA, a top-down software-driven approach was taken and revealed as one of the key design decisions for a flexible, easy-to-program, low-power architecture. These features make RICA an architecture that inherently solves the main design requirements of modern low-power devices. Results show that it delivers considerably lower power consumption than leading VLIW and low-power digital signal processors while still maintaining their throughput performance.


145 citations

Proceedings Article•DOI•

Lightweight secure PUFs


Majzoobi, Koushanfar, Potkonjak

01 Jan 2008

143 citations

Journal Article•

Fast cone-beam CT image reconstruction using GPU hardware


Guorui Yan, Jie Tian, Shouping Zhu, Yakang Dai, Chenghu Qin

01 Jan 2008-Journal of X-ray Science and Technology

TL;DR: This paper implements Feldkamp-Davis-Kress (FDK) algorithm on commodity GPU using an acceleration scheme that saves the copy time, and the combination of z-axis symmetry and multiple render targets (MRTs) reduces the computational cost on the geometry mapping between slices to be reconstructed and projection views.


Abstract: Three-dimensional Computed Tomography (CT) reconstruction is computationally demanding. To accelerate the speed of reconstruction, Application Specific Integrated Circuits (ASICs) or Field Programmable Gate Arrays (FPGAs) have been used, but they are expensive, inflexible and not easy to upgrade. The modern Graphics Processing Unit (GPU) with its programmable features improves this situation and becomes one of the powerful and flexible tools for 3D CT reconstruction. In this paper, we implement the Feldkamp-Davis-Kress (FDK) algorithm on a commodity GPU using an acceleration scheme. In the scheme, two techniques are developed and combined. One is cyclic render-to-texture (CRTT), which saves the copy time, and the other is the combination of z-axis symmetry and multiple render targets (MRTs), which reduces the computational cost of the geometry mapping between slices to be reconstructed and projection views. Our algorithm performs reconstruction of a 512^3 volume from 360 views of size 512 x 512 in about 52s on a single NVIDIA GeForce 8800GTX card.


138 citations

Journal Article•DOI•

Comparison of the FPGA Implementation of Two Multilevel Space Vector PWM Algorithms


Oscar Lopez¹, J. Alvarez¹, Jesus Doval-Gandoy¹, Francisco D. Freijedo¹, A. Nogueiras¹, A. Lago¹, C.M. Penalver¹•Institutions (1)

University of Vigo¹

04 Apr 2008-IEEE Transactions on Industrial Electronics

TL;DR: Two algorithms, 2-D and 3-D, are analyzed and implemented in an FPGA and both implementations are compared in terms of implementation complexity and logic resources required.


Abstract: Multilevel converters can meet the increasing demand of power ratings and power quality associated with reduced harmonic distortion and lower electromagnetic interference. When the number of levels increases, it is necessary to control more and more switches in parallel. Field programmable gate arrays (FPGAs), with their concurrent processing capability, are suitable for the implementation of multilevel modulation algorithms. Among them, space vector pulsewidth modulation algorithms offer great flexibility to optimize switching waveforms and are well suited for digital implementation. In this paper, two algorithms, 2-D and 3-D, are analyzed and implemented in an FPGA. In order to carry out the implementation, both algorithms have been described in very high speed integrated circuit hardware description language, partly hand coded, and partly automatically generated using the system generator tool. Both implementations are compared in terms of implementation complexity and logic resources required. Finally, test results with a neutral-point-clamped inverter are presented.


Book•

FPGA prototyping by VHDL examples


Pong P. Chu

01 Jan 2008

TL;DR: FPGA Prototyping by VHDL Examples is an indispensable companion text for introductory digital design courses and also serves as a valuable self-teaching guide for practicing engineers who wish to learn more about this emerging area of interest.


Abstract: A hands-on introduction to VHDL synthesis and FPGA prototyping. Hardware Description Language (HDL) and Field Programmable Gate Array (FPGA) devices allow designers to quickly develop and simulate a sophisticated digital circuit, realize it on a prototyping device, and verify the operation of its physical implementation. As these technologies have matured, they have become accepted mainstream practice, so that it is possible to use a PC and an inexpensive FPGA prototyping board to construct a complex digital system. This book uses a "learn by doing" approach to introduce the concepts and techniques of VHDL and FPGA to designers through a series of hands-on experiments. FPGA Prototyping by VHDL Examples provides: a collection of clear, easy-to-follow templates for quick code development; a large number of practical examples to illustrate and reinforce the concepts and design techniques; realistic projects that can be implemented and tested on a Xilinx prototyping board; and a thorough exploration of the Xilinx PicoBlaze soft-core microcontroller. Although the book is an introductory text, the examples are developed in a rigorous manner and the derivations follow strict design guidelines and coding practices used for large, complex systems. It lays a solid foundation for students and new engineers and prepares them for future development tasks. FPGA Prototyping by VHDL Examples is an indispensable companion text for introductory digital design courses and also serves as a valuable self-teaching guide for practicing engineers who wish to learn more about this emerging area of interest.


Journal Article•DOI•

NetFPGA—An Open Platform for Teaching How to Build Gigabit-Rate Network Switches and Routers


Glen Gibb¹, John W. Lockwood¹, Jad Naous¹, P. Hartke¹, Nick McKeown¹•Institutions (1)

Stanford University¹

01 Aug 2008-IEEE Transactions on Education

TL;DR: A new version of the NetFPGA platform has been developed and is available for use by the academic community, with modular interfaces that enable development of complex hardware designs by integration of simple building blocks.


Abstract: The NetFPGA platform enables students and researchers to build high-performance networking systems using field-programmable gate array (FPGA) hardware. A new version of the NetFPGA platform has been developed and is available for use by the academic community. The NetFPGA platform has modular interfaces that enable development of complex hardware designs by integration of simple building blocks. FPGA logic is used to implement the core data processing functions while software running on an attached host computer or embedded cores within the device implement control functions. Reference designs and component libraries have been developed for the CS344 course at Stanford University, Stanford, CA, and taught at a series of tutorials held in the United States, United Kingdom, India, China, Australia, and Europe. The open-source Verilog, C, Perl, and Java reference design is available for download from the project website.


Journal Article•DOI•

Steady-State and Dynamic Study of Active Power Filter With Efficient FPGA-Based Control Algorithm


Zeliang Shu¹, Yuhua Guo¹, Jisan Lian¹•Institutions (1)

Southwest Jiaotong University¹

04 Apr 2008-IEEE Transactions on Industrial Electronics

TL;DR: A new approach using field-programmable gate array (FPGA) to implement a fully digital control algorithm of active power filter (APF) is proposed in this paper, and experimental results on a laboratory prototype are given to demonstrate performance of the proposed approach during steady-state and dynamic operations.


Abstract: A new approach using field-programmable gate array (FPGA) to implement a fully digital control algorithm of active power filter (APF) is proposed in this paper. This FPGA-based controller integrates the whole signal-processing function of an APF, including synchronous-reference-frame transform, low-pass filter, three-phase phase-locked loop, inverter-current controller, etc. By case studies on the principle, performance, and architecture, these control blocks are implemented in real-time and synthesized into a medium-scale FPGA chip by adopting some useful digital-signal-processing techniques, such as pipelining, folding and strength reduction, with respect to minimization of hardware resource and enhancement of operating frequency. As a result, the whole algorithm needs around 5000 logic elements and can run at synchronous system-clock rates of up to 65 MHz. Experimental results on a laboratory prototype are given to demonstrate performance of the proposed approach during steady-state and dynamic operations.


Journal Article•DOI•

The Modular Multisensory DLR-HIT-Hand: Hardware and Software Architecture


Hong Liu, P. Meusel, Gerd Hirzinger, Minghe Jin, Yiwei Liu, Zongwu Xie

01 Aug 2008-IEEE-ASME Transactions on Mechatronics

TL;DR: A hierarchical software structure has been established to perform all data processing and the control of the hand, and provides basic application programming interface (API) functions and skills to access all hardware resources for data acquisition, computation, and teleoperation.


Abstract: This paper presents the hardware and software architecture of the newly developed compact multisensory German Aerospace Center (DLR)-Harbin Institute of Technology (HIT)-Hand. The hand has four identical fingers and an extra degree of freedom for the palm. In each finger, there is a field-programmable gate array (FPGA) for data collection and brushless DC motors for control, and communication with the palm's FPGA is accomplished by point-to-point serial communication (PPSeCo). The kernel of the hardware system is a peripheral component interconnect (PCI)-based high-speed floating-point DSP for data processing, and an FPGA for high-speed (up to 25 Mb/s) real-time serial communication with the palm's FPGA. In order to achieve high modularity and reliability of the hand, a fully mechatronic integration and analog signals in situ digitalization philosophy is implemented to minimize the dimension and number of the cables (five cables including power supply) and to protect data communication from outside disturbances. Furthermore, according to the hardware structure of the hand, a hierarchical software structure has been established to perform all data processing and the control of the hand. It provides basic application programming interface (API) functions and skills to access all hardware resources for data acquisition, computation, and teleoperation. With the nice design of the hand's envelope, the hand looks more like a humanoid.


Journal Article•DOI•

A Stochastic-Based FPGA Controller for an Induction Motor Drive With Integrated Neural Network Algorithms


Da Zhang, Hui Li¹•Institutions (1)

Florida A&M University¹

31 Jan 2008-IEEE Transactions on Industrial Electronics

TL;DR: A stochastic NN structure is proposed in this paper for an FPGA implementation of a feedforward NN to estimate the feedback signals in an induction motor drive and significantly reduces the number of logic gates required for the proposed NN estimator.


Abstract: This paper applies stochastic theory to the design and implementation of field-oriented control of an induction motor drive using a single field-programmable gate array (FPGA) device and integrated neural network (NN) algorithms. Normally, NNs are characterized as heavily parallel calculation algorithms that employ enormous computational resources and are less useful for economical digital hardware implementations. A stochastic NN structure is proposed in this paper for an FPGA implementation of a feedforward NN to estimate the feedback signals in an induction motor drive. The stochastic arithmetic simplifies the computational elements of the NN and significantly reduces the number of logic gates required for the proposed NN estimator. A new stochastic proportional-integral speed controller is also developed with antiwindup functionality. Compared with conventional digital controls for motor drives, the proposed stochastic-based algorithm enhances the arithmetic operations of the FPGA, saves digital resources, and permits the NN algorithms and classical control algorithms to be easily interfaced and implemented on a single low-complexity, inexpensive FPGA. The algorithm has been realized using a single FPGA XC3S400 from Xilinx, Inc. A hardware-in-the-loop (HIL) test platform using a Real Time Digital Simulator is built in the laboratory. The HIL experimental results are provided to verify the proposed FPGA controller.
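
To give a sense of why stochastic arithmetic can reduce logic gate counts, here is a minimal Python sketch of generic unipolar stochastic computing, in which a value in [0, 1] is encoded as the fraction of ones in a random bitstream and multiplication reduces to a bitwise AND. This is a textbook illustration under those assumptions, not the authors' NN estimator or controller. The function names and the stream length are hypothetical.

```python
import random
from typing import List


def to_stream(p: float, n: int, rng: random.Random) -> List[int]:
    """Encode probability p as n random bits, each equal to 1 with probability p."""
    return [1 if rng.random() < p else 0 for _ in range(n)]


def from_stream(bits: List[int]) -> float:
    """Decode a bitstream back to a value by counting the fraction of ones."""
    return sum(bits) / len(bits)


def stochastic_multiply(a: float, b: float, n: int = 100_000, seed: int = 0) -> float:
    """Approximate a * b with one AND gate per bit pair of two independent streams."""
    rng = random.Random(seed)
    stream_a = to_stream(a, n, rng)
    stream_b = to_stream(b, n, rng)
    return from_stream([x & y for x, y in zip(stream_a, stream_b)])


estimate = stochastic_multiply(0.5, 0.4)
print(f"stochastic estimate of 0.5 * 0.4: {estimate:.3f}")
```

The result is only an estimate whose error shrinks as the stream gets longer, so the hardware saving (a single gate instead of a multiplier) is traded against precision and the number of clock cycles needed per value.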


Proceedings Article•DOI•

VESPA: portable, scalable, and flexible FPGA-based vector processors


Peter Yiannacouras¹, J. Gregory Steffan¹, Jonathan Rose¹•Institutions (1)

University of Toronto¹

19 Oct 2008

TL;DR: A system of vectorized software and soft vector processor hardware that is portable to any FPGA architecture and vector processor configuration, scalable to larger yet higher-performance designs, and flexible, allowing the underlying vector processor to be customized to match the needs of each application.


Abstract: While soft processors are increasingly common in FPGA-based embedded systems, it remains a challenge to scale their performance. We propose extending soft processor instruction sets to include support for vector processing. The resulting system of vectorized software and soft vector processor hardware is (i) portable to any FPGA architecture and vector processor configuration, (ii) scalable to larger yet higher-performance designs, and (iii) flexible, allowing the underlying vector processor to be customized to match the needs of each application. Using our robust and verified parameterized vector processor design and industry-standard EEMBC benchmarks, we evaluate the performance and area trade-offs for different soft vector processor configurations using an FPGA development platform with DDR SDRAM. We find that on average we can scale performance from 1.8x up to 6.3x for a vector processor design that saturates the capacity of our platform's Stratix 1S80 FPGA. We also automatically generate application-specific vector processors with reduced datapath width and instruction set support which combined reduce the area by up to 70% (61% on average) without affecting performance.


Book•

Digital Systems Design with FPGAs and CPLDs


Ian Grout

01 Jan 2008

TL;DR: This book will be ideal for electronic and computer engineering students taking a practical or Lab based course on digital systems development using PLDs and for engineers in industry looking for concrete advice on developing a digital system using a FPGA or CPLD as its core.


Abstract: This textbook explains how to design and develop digital electronic systems using programmable logic devices (PLDs). Totally practical in nature, the book features numerous case study designs using a variety of Field Programmable Gate Arrays (FPGAs) and Complex Programmable Logic Devices (CPLDs), for a range of applications from control and instrumentation to semiconductor automatic test equipment. Key features include: case studies that provide a walk-through of the design process, highlighting the trade-offs involved; and discussion of real-world issues such as choice of device, pin-out, power supply, power supply decoupling, and signal integrity for embedding FPGAs within a PCB-based design. With this book engineers will be able to: use PLD technology to develop digital and mixed-signal electronic systems; develop PLD-based designs using both schematic capture and VHDL synthesis techniques; interface a PLD to digital and mixed-signal systems; and undertake complete design exercises from design concept through to the build and test of PLD-based electronic hardware. This book will be ideal for electronic and computer engineering students taking a practical or lab-based course on digital systems development using PLDs, and for engineers in industry looking for concrete advice on developing a digital system using an FPGA or CPLD as its core.


Proceedings Article•DOI•

Compact architecture for high-throughput regular expression matching on FPGA


Yi-Hua E. Yang¹, Weirong Jiang¹, Viktor K. Prasanna¹•Institutions (1)

University of Southern California¹

06 Nov 2008

TL;DR: The proposed REM architecture, based on nondeterministic finite automaton (RE-NFA), efficiently constructs regular expression matching engines (REME) of arbitrary regular patterns and character classes in a uniform structure, utilizing both logic slices and block memory available on modern FPGA devices.


Abstract: In this paper we present a novel architecture for high-speed and high-capacity regular expression matching (REM) on FPGA. The proposed REM architecture, based on nondeterministic finite automaton (RE-NFA), efficiently constructs regular expression matching engines (REME) of arbitrary regular patterns and character classes in a uniform structure, utilizing both logic slices and block memory (BRAM) available on modern FPGA devices. The resulting circuits take advantage of synthesis and routing optimizations to achieve high operating speed and area efficiency. The uniform structure of our RE-NFA design can be stacked in a simple way to produce multi-character input circuits to scale up throughput further. An n-state m-character input REME takes only O(n × log2 m) time to construct and occupies no more than O(n × m) logic units. The REMEs can be staged and pipelined in large numbers to achieve high parallelism without sacrificing clock frequency. Using the proposed RE-NFA architecture, we are able to implement 3 copies of two-character input REMEs, each with 760 regular expressions, 18715 states and 371 character classes, onto a single Xilinx Virtex 4 LX-100-12 device. Each copy processes 2 characters per clock cycle at 300 MHz, resulting in a concurrent throughput of 14.4 Gbps for 760 REMEs. Compared with the automatic NFA-to-VHDL REME compilation [13], our approach achieves over 9x throughput efficiency (Gbps*state/LUT). Compared with state-of-the-art REMEs on FPGA, our approach also indicates up to 70% better throughput efficiency.
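
The 14.4 Gbps figure in this abstract can be reproduced from the other numbers it quotes. The Python sketch below does so, assuming 8-bit input characters (the abstract does not state the character width). The function name and parameters are illustrative.

```python
def throughput_gbps(copies: int, chars_per_cycle: int, clock_mhz: float,
                    bits_per_char: int = 8) -> float:
    """Aggregate input throughput of parallel matching engines, in Gbps."""
    # bits per cycle across all copies, times MHz, gives Mbps; divide by 1000 for Gbps
    return copies * chars_per_cycle * bits_per_char * clock_mhz / 1000


per_copy = throughput_gbps(copies=1, chars_per_cycle=2, clock_mhz=300)
total = throughput_gbps(copies=3, chars_per_cycle=2, clock_mhz=300)
print(f"per copy: {per_copy} Gbps, three copies: {total} Gbps")
```

Under that assumption each copy handles 4.8 Gbps, and the three copies together give the quoted 14.4 Gbps.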

Journal Article•DOI•

Embedded Model Predictive Control (MPC) using a FPGA

Keck Voon Ling¹, Bing Fang Wu¹, Jan M. Maciejowski²•Institutions (2)

Nanyang Technological University¹, University of Cambridge²

01 Jan 2008-IFAC Proceedings Volumes

TL;DR: Simulation studies on a realistic example show that it is possible to implement constrained MPC on an FPGA chip with a 25 MHz clock and achieve MPC implementation rates comparable to those achievable on a 3.0 GHz Pentium PC.

Book Chapter•DOI•

Implementation of the AES-128 on Virtex-5 FPGAs

Philippe Bulens¹, François-Xavier Standaert¹, Jean-Jacques Quisquater¹, Pascal Pellegrin¹, Gael Rouvroy¹•Institutions (1)

Université catholique de Louvain¹

11 Jun 2008

TL;DR: An updated implementation of the Advanced Encryption Standard (AES) on the recent Xilinx Virtex-5 FPGAs is presented, showing how a modified slice structure in these reconfigurable hardware devices results in significant improvement of the design efficiency.

Abstract: This paper presents an updated implementation of the Advanced Encryption Standard (AES) on the recent Xilinx Virtex-5 FPGAs. We show how a modified slice structure in these reconfigurable hardware devices results in significant improvement of the design efficiency. In particular, a single substitution box of the AES can fit in 8 FPGA slices. We combine these technological changes with a sound intertwining of the round and key round functionalities in order to produce encryption and decryption architectures that perfectly fit with the Digital Cinema Initiative specifications. More generally, our implementations are convenient for any application requiring Gbps-range throughput.

Proceedings Article•DOI•

A multi-platform controller allowing for maximum Dynamic Partial Reconfiguration throughput

Christopher Claus, B. Zhang, Walter Stechele, Lars Braun¹, Michael Hübner¹, Jürgen Becker¹•Institutions (1)

Karlsruhe Institute of Technology¹

23 Sep 2008

TL;DR: This paper addresses problems, limitations and results of on-chip reconfiguration that enable the user to decide whether DPR is suitable for a certain design prior to its implementation and presents an IP core that enables fast on-chip DPR close to the maximum achievable speed.

Abstract: Dynamic and partial reconfiguration (DPR) is a special feature offered by Xilinx Field Programmable Gate Arrays (FPGAs), giving the designer the ability to reconfigure a certain portion of the FPGA during run-time without influencing the other parts. This feature allows the hardware to be adaptable to any potential situation. For some applications, such as video-based driver assistance, the time needed to exchange a certain portion of the device might be critical. This paper addresses problems, limitations and results of on-chip reconfiguration that enable the user to decide whether DPR is suitable for a certain design prior to its implementation. A method is therefore introduced to calculate the expected reconfiguration throughput and latency. In addition, an IP core is presented that enables fast on-chip DPR close to the maximum achievable speed. Compared to an alternative state-of-the-art realization, an increase in speed by a factor of 58 can be obtained.

Volatile FPGA design security – a survey

Saar Drimer

01 Jan 2008

TL;DR: This survey establishes the foundations for discussing FPGA security, examines a wide range of attacks and defenses along with the current state of industry offerings, and outlines on-going research and latest developments.

Abstract: Volatile FPGAs, the dominant type of programmable logic devices, are used in space, military, automotive, and consumer electronics applications which require them to operate in a wide range of environments. The continuous growth in both their capability and capacity now requires significant resources to be invested in the designs that are created for them. This has brought increased interest in the security attributes of FPGAs; specifically, how well do they protect the information processed within them, how are designs protected during distribution, and how developers’ ownership rights are protected while designs from multiple sources are combined. This survey establishes the foundations for discussing "FPGA security", examines a wide range of attacks and defenses along with the current state of industry offerings, and finally, outlines on-going research and latest developments.

Journal Article•DOI•

Fine-Grain SEU Mitigation for FPGAs Using Partial TMR

Pratt, Caffrey, Carroll, Graham, Morgan, Wirthlin

01 Jan 2008-IEEE Transactions on Nuclear Science

Proceedings Article•DOI•

Fault tolerant methods for reliability in FPGAs

Edward Stott¹, P. Sedcole¹, Peter Y. K. Cheung¹•Institutions (1)

Imperial College London¹

23 Sep 2008

TL;DR: This paper provides the first comprehensive survey of fault detection methods and fault tolerance schemes specifically for FPGAs, with the goal of laying a strong foundation for future research in this field.

Abstract: Reliability and process variability are serious issues for FPGAs in the future. Fortunately FPGAs have the ability to reconfigure in the field and at runtime, thus providing opportunities to overcome some of these issues. This paper provides the first comprehensive survey of fault detection methods and fault tolerance schemes specifically for FPGAs, with the goal of laying a strong foundation for future research in this field. All methods and schemes are qualitatively compared and some particularly promising approaches highlighted.

Journal Article•DOI•

A Scalable Correlator Architecture Based on Modular FPGA Hardware, Reuseable Gateware, and Data Packetization

Aaron R. Parsons¹, Donald C. Backer, Andrew Siemion, Henry Chen, Dan Werthimer, Pierre Droz, Terry Filiba, Jason Manley, Peter L. McMahon, Arash Parsa, David MacMahon, Melvyn Wright•Institutions (1)

University of California, Berkeley¹

13 Oct 2008-Publications of the Astronomical Society of the Pacific

TL;DR: A general-purpose correlator architecture using standard 10-Gbit Ethernet switches to pass data between flexible hardware modules containing Field Programmable Gate Array (FPGA) chips that are programmed using open-source signal-processing libraries that are designed to be flexible, scalable, and chip-independent.

Abstract: A new generation of radio telescopes is achieving unprecedented levels of sensitivity and resolution, as well as increased agility and field of view, by employing high-performance digital signal-processing hardware to phase and correlate signals from large numbers of antennas. The computational demands of these imaging systems scale in proportion to BMN², where B is the signal bandwidth, M is the number of independent beams, and N is the number of antennas. The specifications of many new arrays lead to demands in excess of tens of PetaOps per second. To meet this challenge, we have developed a general-purpose correlator architecture using standard 10-Gbit Ethernet switches to pass data between flexible hardware modules containing Field Programmable Gate Array (FPGA) chips. These chips are programmed using open-source signal-processing libraries that we have developed to be flexible, scalable, and chip-independent. This work reduces the time and cost of implementing a wide range of...
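
The scaling law stated in this abstract (demand proportional to the bandwidth times the number of beams times the square of the number of antennas) can be illustrated with a short sketch. The proportionality constant is not given in the abstract, so the function below returns only a relative figure in arbitrary units; its name and the example values are hypothetical.

```python
def relative_demand(bandwidth, beams, antennas):
    """Relative computational demand, proportional to B * M * N**2.

    The constant of proportionality is unknown here, so only ratios
    between two configurations are meaningful.
    """
    return bandwidth * beams * antennas ** 2


# Hypothetical example: doubling the antenna count at fixed bandwidth and beams
base = relative_demand(bandwidth=1.0, beams=1, antennas=32)
doubled = relative_demand(bandwidth=1.0, beams=1, antennas=64)
print(doubled / base)
```

Because the antenna count enters quadratically, it dominates the growth in demand, whereas bandwidth and beam count contribute only linearly.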

Journal Article•DOI•

Warp Processing: Dynamic Translation of Binaries to FPGA Circuits

Frank Vahid¹, Greg Stitt², Roman Lysecky³•Institutions (3)

University of California, Riverside¹, University of Florida², University of Arizona³

01 Jul 2008-IEEE Computer

TL;DR: A new architecture and set of dynamic CAD tools demonstrate warp processing's potential, resulting in 2X to 100X speedup over executing on microprocessors.

Abstract: Warp processing dynamically and transparently transforms an executing microprocessor's binary kernels into customized field-programmable gate array (FPGA) circuits, commonly resulting in 2X to 100X speedup over executing on microprocessors. A new architecture and set of dynamic CAD tools demonstrate warp processing's potential.

Journal Article•DOI•

2008 Special Issue: Compact hardware liquid state machines on FPGA for real-time speech recognition

Benjamin Schrauwen¹, Michiel D'Haene¹, David Verstraeten¹, Jan Van Campenhout¹•Institutions (1)

Ghent University¹

01 Mar 2008-Neural Networks

TL;DR: In this paper, the authors present an application driven hardware exploration where they implement real-time, isolated digit speech recognition using a Liquid State Machine, a recurrent neural network of spiking neurons where only the output layer is trained.

Journal Article•DOI•

FPGA-Based Predictive Current Controller for Synchronous Machine Speed Drive

Mohamed Wissem Naouar, A.A. Naassani, Eric Monmasson, I. Slama Belkhodja

09 Jul 2008-IEEE Transactions on Power Electronics

TL;DR: A field programmable gate array (FPGA)-based speed controller for a synchronous machine with an internal current control loop based on a predictive current controller is presented, and experimental results are shown to prove the efficiency of FPGA-based solutions to achieve high performances.

Abstract: In this paper, a field programmable gate array (FPGA)-based speed controller for a synchronous machine with an internal current control loop based on a predictive current controller is presented. Due to their complex computation schemes, predictive current controllers implemented in a full digital system are characterized by an inevitable delay in calculating and applying the switching states to the inverter. Consequently, their performances are affected and the achieved sampling frequency is limited. These digital control limitations are mainly due to the processing speed versus computational complexity trade-off. To cope with this problem, specific digital hardware technology such as FPGA can be used as an alternative digital solution to ensure fast processing operation and to preserve performances of predictive current controllers in spite of their complex computation schemes. Such performances can be preserved thanks to the high flexibility and high computation capabilities of FPGAs. In order to illustrate this, an FPGA implementation of a synchronous machine speed controller based on a predictive current controller is presented and fully analyzed in this work. The obtained execution time is only a few microseconds for the whole control algorithm. Experimental results are shown to prove the efficiency of FPGA-based solutions to achieve high performances.

Proceedings Article•

Compact hardware liquid state machines on FPGA for real-time speech recognition

Benjamin Schrauwen, Michiel D'Haene, David Verstraeten, Jan Van Campenhout

01 Jan 2008

TL;DR: This work presents an application driven digital hardware exploration where real-time, isolated digit speech recognition is implemented using a Liquid State Machine, a recurrent neural network of spiking neurons where only the output layer is trained.

Abstract: Hardware implementations of Spiking Neural Networks are numerous because they are well suited for implementation in digital and analog hardware, and outperform classic neural networks. This work presents an application driven digital hardware exploration where we implement real-time, isolated digit speech recognition using a Liquid State Machine. The Liquid State Machine is a recurrent neural network of spiking neurons where only the output layer is trained. First we test two existing hardware architectures, which we improve and extend, but which appear to be too fast and thus too area-consuming for this application. Next, we present a scalable, serialized architecture that allows a very compact implementation of spiking neural networks that is still fast enough for real-time processing. All architectures support leaky integrate-and-fire membranes with exponential synaptic models. This work shows that there is actually a large hardware design space of Spiking Neural Network hardware that can be explored. Existing architectures have only spanned part of it.
