Reconfigurable Computing: A Survey of Systems and Software
KATHERINE COMPTON
Northwestern University
AND
SCOTT HAUCK
University of Washington
Due to its potential to greatly accelerate a wide variety of applications, reconfigurable
computing has become a subject of a great deal of research. Its key feature is the ability
to perform computations in hardware to increase performance, while retaining much of
the flexibility of a software solution. In this survey, we explore the hardware aspects of
reconfigurable computing machines, from single chip architectures to multi-chip
systems, including internal structures and external coupling. We also focus on the
software that targets these machines, such as compilation tools that map high-level
algorithms directly to the reconfigurable substrate. Finally, we consider the issues
involved in run-time reconfigurable systems, which reuse the configurable hardware
during program execution.
Categories and Subject Descriptors: A.1 [Introductory and Survey]; B.6.1 [Logic
Design]: Design Style—logic arrays; B.6.3 [Logic Design]: Design Aids; B.7.1
[Integrated Circuits]: Types and Design Styles—gate arrays
General Terms: Design, Performance
Additional Key Words and Phrases: Automatic design, field-programmable, FPGA,
manual design, reconfigurable architectures, reconfigurable computing, reconfigurable
systems
This research was supported in part by Motorola, Inc., DARPA, and NSF.
K. Compton was supported by an NSF fellowship.
S. Hauck was supported in part by an NSF CAREER award and a Sloan Research Fellowship.
Authors’ addresses: K. Compton, Department of Electrical and Computer Engineering, Northwestern University, 2145 Sheridan Road, Evanston, IL 60208-3118; e-mail: kati@ece.northwestern.edu; S. Hauck, Department of Electrical Engineering, The University of Washington, Box 352500, Seattle, WA 98195; e-mail: hauck@ee.washington.edu.
Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or direct commercial advantage and that copies show this notice on the first page or initial screen of a display along with the full citation. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, to republish, to post on servers, to redistribute to lists, or to use any component of this work in other works requires prior specific permission and/or a fee. Permissions may be requested from Publications Dept., ACM, Inc., 1515 Broadway, New York, NY 10036 USA, fax +1 (212) 869-0481, or permissions@acm.org.
© 2002 ACM 0360-0300/02/0600-0171 $5.00
ACM Computing Surveys, Vol. 34, No. 2, June 2002, pp. 171–210.
1. INTRODUCTION
There are two primary methods in conventional computing for the execution of algorithms. The first is to use hardwired technology, either an Application Specific Integrated Circuit (ASIC) or a group of individual components forming a board-level solution, to perform the operations in hardware. ASICs are designed specifically to perform a given computation, and thus they are very fast and efficient when executing the exact computation for which they were designed. However, the circuit cannot be altered after fabrication. This forces a redesign and refabrication of the chip if any part of its circuit requires modification. This is an expensive process, especially when one considers the difficulties in replacing ASICs in a large number of deployed systems. Board-level circuits are also somewhat inflexible, frequently requiring a board redesign and replacement in the event of changes to the application.
The second method is to use soft-
ware-programmed microprocessors—a far
more flexible solution. Processors execute
a set of instructions to perform a compu-
tation. By changing the software instruc-
tions, the functionality of the system is
altered without changing the hardware.
However, the downside of this flexibility
is that the performance can suffer, if not
in clock speed then in work rate, and is
far below that of an ASIC. The processor
must read each instruction from memory,
decode its meaning, and only then exe-
cute it. This results in a high execution
overhead for each individual operation.
Additionally, the set of instructions that
may be used by a program is determined
at the fabrication time of the processor.
Any other operations that are to be im-
plemented must be built out of existing
instructions.
Reconfigurable computing is intended to
fill the gap between hardware and soft-
ware, achieving potentially much higher
performance than software, while main-
taining a higher level of flexibility than
hardware. Reconfigurable devices, in-
cluding field-programmable gate arrays
(FPGAs), contain an array of computa-
tional elements whose functionality is de-
termined through multiple programmable
configuration bits. These elements, some-
times known as logic blocks, are connected
using a set of routing resources that are
also programmable. In this way, custom
digital circuits can be mapped to the recon-
figurable hardware by computing the logic
functions of the circuit within the logic
blocks, and using the configurable routing
to connect the blocks together to form the
necessary circuit.
FPGAs and reconfigurable computing
have been shown to accelerate a variety of
applications. Data encryption, for exam-
ple, is able to leverage both parallelism
and fine-grained data manipulation. An
implementation of the Serpent Block
Cipher in the Xilinx Virtex XCV1000
shows a throughput increase by a factor
of over 18 compared to a Pentium Pro
PC running at 200 MHz [Elbirt and Paar
2000]. Additionally, a reconfigurable com-
puting implementation of sieving for fac-
toring large numbers (useful in breaking
encryption schemes) was accelerated by a
factor of 28 over a 200-MHz UltraSparc
workstation [Kim and Mangione-Smith
2000]. The Garp architecture shows a
comparable speed-up for DES [Hauser
and Wawrzynek 1997], as does an
FPGA implementation of an elliptic curve
cryptography application [Leung et al.
2000].
Other recent applications that have
been shown to exhibit significant speed-
ups using reconfigurable hardware
include: automatic target recognition
[Rencher and Hutchings 1997], string pat-
tern matching [Weinhardt and Luk 1999],
Golomb Ruler Derivation [Dollas et al.
1998; Sotiriades et al. 2000], transitive
closure of dynamic graphs [Huelsbergen
2000], Boolean satisfiability [Zhong et al.
1998], data compression [Huang et al.
2000], and genetic algorithms for the tra-
velling salesman problem [Graham and
Nelson 1996].
In order to achieve these performance
benefits, yet support a wide range of appli-
cations, reconfigurable systems are usu-
ally formed with a combination of re-
configurable logic and a general-purpose
microprocessor. The processor performs
the operations that cannot be done effi-
ciently in the reconfigurable logic, such
as data-dependent control and possibly
memory accesses, while the computational
cores are mapped to the reconfigurable
hardware. This reconfigurable logic can be
composed of either commercial FPGAs or
custom configurable hardware.
Compilation environments for reconfig-
urable hardware range from tools to assist
a programmer in performing a hand map-
ping of a circuit to the hardware, to com-
plete automated systems that take a cir-
cuit description in a high-level language
to a configuration for a reconfigurable sys-
tem. The design process involves first par-
titioning a program into sections to be im-
plemented on hardware, and those which
are to be implemented in software on the
host processor. The computations destined
for the reconfigurable hardware are syn-
thesized into a gate level or register trans-
fer level circuit description. This circuit is
mapped onto the logic blocks within the re-
configurable hardware during the technol-
ogy mapping phase. These mapped blocks
are then placed into the specific physi-
cal blocks within the hardware, and the
pieces of the circuit are connected using
the reconfigurable routing. After compi-
lation, the circuit is ready for configura-
tion onto the hardware at run-time. These
steps, when performed using an automatic
compilation system, require very little ef-
fort on the part of the programmer to
utilize the reconfigurable hardware. How-
ever, performing some or all of these oper-
ations by hand can result in a more highly
optimized circuit for performance-critical
applications.
Since FPGAs must pay an area penalty
because of their reconfigurability, device
capacity can sometimes be a concern. Sys-
tems that are configured only at power-
up are able to accelerate only as much
of the program as will fit within the pro-
grammable structures. Additional areas of
a program might be accelerated by reusing
the reconfigurable hardware during pro-
gram execution. This process is known
as run-time reconfiguration (RTR). While
this style of computing has the benefit of
allowing for the acceleration of a greater
portion of an application, it also introduces
the overhead of configuration, which lim-
its the amount of acceleration possible. Be-
cause configuration can take milliseconds
or longer, rapid and efficient configuration
is a critical issue. Methods such as config-
uration compression and the partial reuse
of already programmed configurations can
be used to reduce this overhead.
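The trade-off described above can be made concrete with a back-of-the-envelope model. The sketch below is illustrative only and is not taken from the survey: the function name `rtr_speedup`, its parameters, and the simple additive timing model are all assumptions. It treats total run time as the unaccelerated software portion, plus the accelerated kernel time, plus a fixed cost per reconfiguration.

```python
def rtr_speedup(software_time_s, kernel_fraction, kernel_speedup,
                reconfigurations, config_time_s):
    """Estimate overall speedup of a run-time reconfigured system.

    Hypothetical additive model (an assumption, not the survey's):
      software_time_s  -- run time of the all-software version, in seconds
      kernel_fraction  -- share of that time spent in accelerated kernels
      kernel_speedup   -- speedup of those kernels in reconfigurable logic
      reconfigurations -- number of times the hardware is reconfigured
      config_time_s    -- time for one reconfiguration, in seconds
    """
    unaccelerated = software_time_s * (1.0 - kernel_fraction)
    accelerated = software_time_s * kernel_fraction / kernel_speedup
    overhead = reconfigurations * config_time_s
    return software_time_s / (unaccelerated + accelerated + overhead)


# A 1-second program with 90% of its time in kernels that run 20x faster.
no_overhead = rtr_speedup(1.0, 0.9, 20.0, 0, 0.001)
# The same program when the array is reconfigured 100 times at 1 ms each.
with_overhead = rtr_speedup(1.0, 0.9, 20.0, 100, 0.001)
print(f"{no_overhead:.2f}x without reconfiguration cost")   # roughly 6.9x
print(f"{with_overhead:.2f}x with reconfiguration cost")    # roughly 4.1x
```

Under these assumed numbers, a tenth of a second of total configuration time removes a large share of the benefit, which is why techniques that shorten or avoid configuration matter.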
This article presents a survey of cur-
rent research in hardware and software
systems for reconfigurable computing, as
well as techniques that specifically target
run-time reconfigurability. We lead off this
discussion by examining the technology
required for reconfigurable computing, fol-
lowed by a more in-depth examination of
the various hardware structures used in
reconfigurable systems. Next, we look at
the software required for compilation of
algorithms to configurable computers, and
the trade-offs between hand-mapping and
automatic compilation. Finally, we discuss
run-time reconfigurable systems, which
further utilize the intrinsic flexibility of
configurable computing platforms by opti-
mizing the hardware not only for different
applications, but for different operations
within a single application as well.
This survey does not seek to cover ev-
ery technique and research project in the
area of reconfigurable computing. Instead,
it hopes to serve as an introduction to
this rapidly evolving field, bringing in-
terested readers quickly up to speed on
developments from the last half-decade.
Those interested in further background
can find coverage of older techniques
and systems elsewhere [Rose et al. 1993;
Hauck and Agarwal 1996; Vuillemin et al.
1996; Mangione-Smith et al. 1997; Hauck
1998b].
2. TECHNOLOGY
Reconfigurable computing as a concept has been in existence for quite some time [Estrin et al. 1963]. Even general-purpose processors use some of the same basic ideas, such as reusing computational components for independent computations, and using multiplexers to control the routing between these components. However, the term reconfigurable computing, as it is used in current research (and within this survey), refers to systems incorporating some form of hardware programmability, customizing how the hardware is used by means of a number of physical control points. These control points can then be changed periodically in order to execute different applications using the same hardware.
Fig. 1. A programming bit for SRAM-based FPGAs [Xilinx 1994] (left) and a programmable routing connection (right).
The recent advances in reconfigurable
computing are for the most part de-
rived from the technologies developed
for FPGAs in the mid-1980s. FPGAs
were originally created to serve as a hy-
brid device between PALs and Mask-
Programmable Gate Arrays (MPGAs).
Like PALs, FPGAs are fully electrically
programmable, meaning that the physical
design costs are amortized over multiple
application circuit implementations, and
the hardware can be customized nearly in-
stantaneously. Like MPGAs, they can im-
plement very complex computations on a
single chip, with devices currently in pro-
duction containing the equivalent of over
a million gates. Because of these features,
FPGAs had been primarily viewed as glue-
logic replacement and rapid-prototyping
vehicles. However, as we show through-
out this article, the flexibility, capacity,
and performance of these devices have
opened up completely new avenues in
high-performance computation, forming
the basis of reconfigurable computing.
Most current FPGAs and reconfigurable devices are SRAM-programmable (Figure 1 left), meaning that SRAM¹ bits are connected to the configuration points in the FPGA, and programming the SRAM bits configures the FPGA. Thus, these chips can be programmed and reprogrammed about as easily as a standard static RAM. In fact, one research project, the PAM project [Vuillemin et al. 1996], considers a group of one or more FPGAs to be a RAM unit that performs computation between the memory write (sending the configuration information and input data) and memory read (reading the results of the computation). This leads some to use the term Programmable Active Memory or PAM.
¹ The term “SRAM” is technically incorrect for many FPGA architectures, given that the configuration memory may or may not support random access. In fact, the configuration memory tends to be continually read in order to perform its function. However, this is the generally accepted term in the field and correctly conveys the concept of static volatile memory using an easily understandable label.
One example of how the SRAM configu-
ration points can be used is to control rout-
ing within a reconfigurable device [Chow
et al. 1999a]. To configure the routing on
an FPGA, typically a passgate structure
is employed (see Figure 1 right). Here the
programming bit will turn on a routing
connection when it is configured with a
true value, allowing a signal to flow from
one wire to another, and will disconnect
these resources when the bit is set to false.
With a proper interconnection of these ele-
ments, which may include millions of rout-
ing choice points within a single device, a
rich routing fabric can be created.
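As a rough software analogy for the passgate routing just described, the sketch below treats each potential wire-to-wire connection as one pass gate guarded by one configuration bit, and then computes which wires a signal can reach. The function `connected_wires` and the four-wire example are hypothetical illustrations, not a model of any particular device.

```python
def connected_wires(num_wires, passgates, config_bits, source):
    """Return the set of wires electrically joined to `source`.

    passgates   -- list of (wire_a, wire_b) pairs, one per pass gate
    config_bits -- list of booleans, one per pass gate (True = conducting)
    """
    # Only pass gates whose configuration bit is true form a connection.
    adjacency = {wire: set() for wire in range(num_wires)}
    for (a, b), bit in zip(passgates, config_bits):
        if bit:
            adjacency[a].add(b)
            adjacency[b].add(a)

    # Flood outward from the source wire through conducting pass gates.
    reached = {source}
    frontier = [source]
    while frontier:
        wire = frontier.pop()
        for neighbour in adjacency[wire]:
            if neighbour not in reached:
                reached.add(neighbour)
                frontier.append(neighbour)
    return reached


# Four wires with four candidate connections between them.
passgates = [(0, 1), (1, 2), (2, 3), (0, 3)]
# Two different configurations of the same physical routing resources.
print(connected_wires(4, passgates, [True, False, True, False], 0))
print(connected_wires(4, passgates, [True, True, False, False], 0))
```

Changing only the list of configuration bits changes which wires are joined, while the set of physical pass gates stays fixed.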
Another example of how these configuration bits may be used is to control multiplexers, which will choose between the output of different logic resources within the array. For example, to provide optional stateholding elements a D flip-flop (DFF) may be included with a multiplexer selecting whether to forward the latched or unlatched signal value (see Figure 2 left). Thus, for systems that require stateholding the programming bits controlling the multiplexer would be configured to select the DFF output, while systems that do not need this function would choose the bypass route that sends the input directly to the output. Similar structures can choose between other on-chip functionalities, such as fixed-logic computation elements, memories, carry chains, or other functions.
Fig. 2. D flip-flop with optional bypass (left) and a 3-input LUT (right).
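The optional-register structure can be mimicked in a few lines of software. The following is a behavioral sketch only: the class name `ConfigurableRegisterStage` and its methods are invented for illustration, and a single boolean stands in for the programming bit that drives the multiplexer select line.

```python
class ConfigurableRegisterStage:
    """A D flip-flop followed by a bypass multiplexer (behavioral sketch)."""

    def __init__(self, use_flip_flop):
        # Configuration bit: True selects the latched (DFF) value,
        # False selects the bypass path straight from the input.
        self.use_flip_flop = use_flip_flop
        self.state = 0  # value currently held by the flip-flop

    def clock_edge(self, data_in):
        """Capture the input into the flip-flop on a clock edge."""
        self.state = data_in

    def output(self, data_in):
        """Multiplexer: forward the latched or the unlatched value."""
        return self.state if self.use_flip_flop else data_in


registered = ConfigurableRegisterStage(use_flip_flop=True)
bypassed = ConfigurableRegisterStage(use_flip_flop=False)

print(registered.output(1))  # flip-flop has not captured the input yet
registered.clock_edge(1)
print(registered.output(0))  # still shows the value captured at the edge
print(bypassed.output(1))    # bypass path follows the input directly
```

The same object behaves as a stateholding element or as a plain wire depending on one configuration bit, which is the point the paragraph above makes about multiplexer-controlled resources.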
Finally, the configuration bits may be
used as control signals for a computational
unit or as the basis for computation it-
self. As a control signal, a configuration
bit may determine whether an ALU per-
forms an addition, subtraction, or other
logic computations. On the other hand,
with a structure such as a lookup table
(LUT), the configuration bits themselves
form the result of the computation (see
Figure 2 right). These elements are essen-
tially small memories provided for com-
puting arbitrary logic functions. LUTs can
compute any function of N inputs (where
N is the number of control signals for the
LUT’s multiplexer) by programming the
2^N programming bits with the truth table
of the desired function. Thus, if all
programming bits except the one corresponding
to the input pattern 111 were
set to zero, a 3-input LUT would act as a
3-input AND gate, while programming it
with all ones except in 000 would compute
an OR.
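A lookup table is easy to model in software, which may help make the idea concrete. In the sketch below, `make_lut` is a hypothetical helper (not from the survey) that takes the list of programming bits and returns a function whose inputs act as the multiplexer select lines. The convention that the first input is the most significant index bit is an assumption chosen so that the pattern 111 maps to the last entry.

```python
def make_lut(config_bits):
    """Build an N-input LUT from its 2**N programming bits.

    config_bits[i] is the output for the input pattern whose binary
    value is i, with the first input as the most significant bit
    (an assumed convention for this sketch).
    """
    n = len(config_bits).bit_length() - 1
    if n < 0 or len(config_bits) != 1 << n:
        raise ValueError("number of programming bits must be a power of two")

    def lut(*inputs):
        if len(inputs) != n:
            raise ValueError(f"expected {n} inputs")
        # The inputs select one programming bit, like a multiplexer.
        index = 0
        for bit in inputs:
            index = (index << 1) | (1 if bit else 0)
        return config_bits[index]

    return lut


# Only the entry for input pattern 111 is set: a 3-input AND.
and3 = make_lut([0, 0, 0, 0, 0, 0, 0, 1])
# Every entry except the one for pattern 000 is set: a 3-input OR.
or3 = make_lut([0, 1, 1, 1, 1, 1, 1, 1])

print(and3(1, 1, 1), and3(1, 0, 1))
print(or3(0, 0, 0), or3(0, 1, 0))
```

No logic is evaluated at run time here; the function is entirely determined by the stored bits, which is what allows the same hardware structure to implement any function of its inputs.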
3. HARDWARE
Reconfigurable computing systems use
FPGAs or other programmable hardware
to accelerate algorithm execution by map-
ping compute-intensive calculations to the
reconfigurable substrate. These hardware
resources are frequently coupled with a
general-purpose microprocessor that is
responsible for controlling the reconfig-
urable logic and executing program code
that cannot be efficiently accelerated. In
very closely coupled systems, the recon-
figurability lies within customizable func-
tional units on the regular datapath of
the microprocessor. On the other hand, a
reconfigurable computing system can be
as loosely coupled as a networked stand-
alone unit. Most reconfigurable systems
are categorized somewhere between these
two extremes, frequently with the reconfigurable
hardware acting as a coprocessor
to a host microprocessor. The programmable
array itself can be composed
of one or more commercially available
FPGAs, or can be a custom device designed
specifically for reconfigurable computing.
The design of the actual computation
blocks within the reconfigurable hardware
varies from system to system. Each unit of
computation, or logic block, can be as sim-
ple as a 3-input lookup table (LUT), or as
complex as a 4-bit ALU. This difference
in block size is commonly referred to as
the granularity of the logic block, where
the 3-input LUT is an example of a very
fine-grained computational element, and a
4-bit ALU is an example of a quite coarse-
grained unit. The finer-grained blocks are
useful for bit-level manipulations, while
the coarse-grained blocks are better opti-
mized for standard datapath applications.
Some architectures employ different sizes
or types of blocks within a single recon-
figurable array in order to efficiently sup-
port different types of computation. For
example, memory is frequently embedded
within the reconfigurable hardware to pro-
vide temporary data storage, forming a
heterogeneous structure composed of both
logic blocks and memory blocks [Ebeling
et al. 1996; Altera 1998; Lucent 1998;
Marshall et al. 1999; Xilinx 1999].
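To illustrate the granularity trade-off, the sketch below performs the same 4-bit addition two ways: with a chain of fine-grained 3-input LUTs (two per bit, one for the sum and one for the carry), and with a single coarse-grained operation standing in for a 4-bit ALU block. All names here (`lut3`, `add_with_luts`, `add_with_alu`) are hypothetical, and the model ignores dedicated carry chains and other features that real devices may provide.

```python
# Programming bits for a full adder, indexed by (a << 2) | (b << 1) | carry_in.
SUM_BITS = [0, 1, 1, 0, 1, 0, 0, 1]    # a XOR b XOR carry_in
CARRY_BITS = [0, 0, 0, 1, 0, 1, 1, 1]  # majority of a, b, carry_in


def lut3(config_bits, a, b, c):
    """A 3-input LUT: the inputs select one of eight programming bits."""
    return config_bits[(a << 2) | (b << 1) | c]


def add_with_luts(x, y, width=4):
    """Fine-grained version: ripple-carry adder built from 2 LUTs per bit."""
    carry = 0
    result = 0
    for i in range(width):
        a = (x >> i) & 1
        b = (y >> i) & 1
        result |= lut3(SUM_BITS, a, b, carry) << i
        carry = lut3(CARRY_BITS, a, b, carry)
    return result, carry


def add_with_alu(x, y, width=4):
    """Coarse-grained version: one block performs the whole addition."""
    total = x + y
    return total & ((1 << width) - 1), total >> width


print(add_with_luts(9, 8))  # (sum, carry_out) from eight 3-input LUTs
print(add_with_alu(9, 8))   # (sum, carry_out) from one ALU-style block
```

Both versions give the same result; the difference lies in how many blocks and how much inter-block routing the fine-grained version needs, against how little the coarse-grained block can be repurposed for bit-level work.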