(Open Access) Reconfigurable computing: a survey of systems and software (2002) | Katherine Compton

Reconfigurable Computing: A Survey of Systems and Software

KATHERINE COMPTON

Northwestern University

AND

SCOTT HAUCK

University of Washington

Due to its potential to greatly accelerate a wide variety of applications, reconfigurable computing has become a subject of a great deal of research. Its key feature is the ability to perform computations in hardware to increase performance, while retaining much of the flexibility of a software solution. In this survey, we explore the hardware aspects of reconfigurable computing machines, from single chip architectures to multi-chip systems, including internal structures and external coupling. We also focus on the software that targets these machines, such as compilation tools that map high-level algorithms directly to the reconfigurable substrate. Finally, we consider the issues involved in run-time reconfigurable systems, which reuse the configurable hardware during program execution.

Categories and Subject Descriptors: A.1 [Introductory and Survey]; B.6.1 [Logic

Design]: Design Style—logic arrays; B.6.3 [Logic Design]: Design Aids; B.7.1

[Integrated Circuits]: Types and Design Styles—gate arrays

General Terms: Design, Performance

Additional Key Words and Phrases: Automatic design, field-programmable, FPGA, manual design, reconfigurable architectures, reconfigurable computing, reconfigurable systems

This research was supported in part by Motorola, Inc., DARPA, and NSF. K. Compton was supported by an NSF fellowship. S. Hauck was supported in part by an NSF CAREER award and a Sloan Research Fellowship.

Authors' addresses: K. Compton, Department of Electrical and Computer Engineering, Northwestern University, 2145 Sheridan Road, Evanston, IL 60208-3118; e-mail: kati@ece.northwestern.edu; S. Hauck, Department of Electrical Engineering, The University of Washington, Box 352500, Seattle, WA 98195; e-mail: hauck@ee.washington.edu.

Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or direct commercial advantage and that copies show this notice on the first page or initial screen of a display along with the full citation. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, to republish, to post on servers, to redistribute to lists, or to use any component of this work in other works requires prior specific permission and/or a fee. Permissions may be requested from Publications Dept., ACM, Inc., 1515 Broadway, New York, NY 10036 USA, fax +1 (212) 869-0481, or permissions@acm.org.

2002 ACM 0360-0300/02/0600-0171 $5.00

ACM Computing Surveys, Vol. 34, No. 2, June 2002, pp. 171–210.

1. INTRODUCTION

There are two primary methods in conventional computing for the execution of algorithms. The first is to use hardwired technology, either an Application Specific Integrated Circuit (ASIC) or a group of individual components forming a board-level solution, to perform the operations in hardware. ASICs are designed specifically to perform a given computation, and thus they are very fast and efficient when executing the exact computation for which they were designed. However, the circuit cannot be altered after fabrication. This forces a redesign and refabrication of the chip if any part of its circuit requires modification. This is an expensive process, especially when one considers the difficulties in replacing ASICs in a large number of deployed systems. Board-level circuits are also somewhat inflexible, frequently requiring a board redesign and replacement in the event of changes to the application.

The second method is to use software-programmed microprocessors, a far more flexible solution. Processors execute a set of instructions to perform a computation. By changing the software instructions, the functionality of the system is altered without changing the hardware. However, the downside of this flexibility is that the performance can suffer, if not in clock speed then in work rate, and is far below that of an ASIC. The processor must read each instruction from memory, decode its meaning, and only then execute it. This results in a high execution overhead for each individual operation. Additionally, the set of instructions that may be used by a program is determined at the fabrication time of the processor. Any other operations that are to be implemented must be built out of existing instructions.

Reconfigurable computing is intended to fill the gap between hardware and software, achieving potentially much higher performance than software, while maintaining a higher level of flexibility than hardware. Reconfigurable devices, including field-programmable gate arrays (FPGAs), contain an array of computational elements whose functionality is determined through multiple programmable configuration bits. These elements, sometimes known as logic blocks, are connected using a set of routing resources that are also programmable. In this way, custom digital circuits can be mapped to the reconfigurable hardware by computing the logic functions of the circuit within the logic blocks, and using the configurable routing to connect the blocks together to form the necessary circuit.

FPGAs and reconfigurable computing have been shown to accelerate a variety of applications. Data encryption, for example, is able to leverage both parallelism and fine-grained data manipulation. An implementation of the Serpent Block Cipher in the Xilinx Virtex XCV1000 shows a throughput increase by a factor of over 18 compared to a Pentium Pro PC running at 200 MHz [Elbirt and Paar 2000]. Additionally, a reconfigurable computing implementation of sieving for factoring large numbers (useful in breaking encryption schemes) was accelerated by a factor of 28 over a 200-MHz UltraSparc workstation [Kim and Mangione-Smith 2000]. The Garp architecture shows a comparable speed-up for DES [Hauser and Wawrzynek 1997], as does an FPGA implementation of an elliptic curve cryptography application [Leung et al. 2000].

Other recent applications that have been shown to exhibit significant speed-ups using reconfigurable hardware include: automatic target recognition [Rencher and Hutchings 1997], string pattern matching [Weinhardt and Luk 1999], Golomb Ruler Derivation [Dollas et al. 1998; Sotiriades et al. 2000], transitive closure of dynamic graphs [Huelsbergen 2000], Boolean satisfiability [Zhong et al. 1998], data compression [Huang et al. 2000], and genetic algorithms for the travelling salesman problem [Graham and Nelson 1996].

In order to achieve these performance benefits, yet support a wide range of applications, reconfigurable systems are usually formed with a combination of reconfigurable logic and a general-purpose microprocessor. The processor performs the operations that cannot be done efficiently in the reconfigurable logic, such as data-dependent control and possibly memory accesses, while the computational cores are mapped to the reconfigurable hardware. This reconfigurable logic can be composed of either commercial FPGAs or custom configurable hardware.

Compilation environments for reconfigurable hardware range from tools to assist a programmer in performing a hand mapping of a circuit to the hardware, to complete automated systems that take a circuit description in a high-level language to a configuration for a reconfigurable system. The design process involves first partitioning a program into sections to be implemented on hardware, and those which are to be implemented in software on the host processor. The computations destined for the reconfigurable hardware are synthesized into a gate level or register transfer level circuit description. This circuit is mapped onto the logic blocks within the reconfigurable hardware during the technology mapping phase. These mapped blocks are then placed into the specific physical blocks within the hardware, and the pieces of the circuit are connected using the reconfigurable routing. After compilation, the circuit is ready for configuration onto the hardware at run-time. These steps, when performed using an automatic compilation system, require very little effort on the part of the programmer to utilize the reconfigurable hardware. However, performing some or all of these operations by hand can result in a more highly optimized circuit for performance-critical applications.

Since FPGAs must pay an area penalty because of their reconfigurability, device capacity can sometimes be a concern. Systems that are configured only at power-up are able to accelerate only as much of the program as will fit within the programmable structures. Additional areas of a program might be accelerated by reusing the reconfigurable hardware during program execution. This process is known as run-time reconfiguration (RTR). While this style of computing has the benefit of allowing for the acceleration of a greater portion of an application, it also introduces the overhead of configuration, which limits the amount of acceleration possible. Because configuration can take milliseconds or longer, rapid and efficient configuration is a critical issue. Methods such as configuration compression and the partial reuse of already programmed configurations can be used to reduce this overhead.
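The effect of configuration overhead on achievable acceleration can be illustrated with a little arithmetic. The sketch below is illustrative only: the function name `effective_speedup`, its parameters, and all timing values are assumptions chosen for the example, not measurements from the survey. It models a kernel that takes `t_sw` seconds in software and `t_hw` seconds on the reconfigurable hardware, with `t_config` seconds charged each time the configuration is loaded.

```python
def effective_speedup(t_sw, t_hw, t_config, runs_per_config=1):
    """Speedup of a hardware kernel once configuration time is charged.

    t_sw: software execution time per run (seconds)
    t_hw: hardware execution time per run (seconds)
    t_config: time to load the configuration (seconds)
    runs_per_config: number of runs that share one loaded configuration
    """
    total_sw = t_sw * runs_per_config
    total_hw = t_hw * runs_per_config + t_config
    return total_sw / total_hw


# Hypothetical numbers: 10 ms in software, 0.5 ms in hardware, 5 ms to configure.
raw = effective_speedup(10e-3, 0.5e-3, 0.0)
single = effective_speedup(10e-3, 0.5e-3, 5e-3)
amortized = effective_speedup(10e-3, 0.5e-3, 5e-3, runs_per_config=100)
print(f"no overhead: {raw:.1f}x, one run: {single:.2f}x, 100 runs: {amortized:.1f}x")
```

Under these assumed numbers, a configuration that is used once loses most of its raw advantage, while one that is reused many times recovers nearly all of it. This is why reducing configuration time, or reusing what is already loaded, matters for run-time reconfiguration.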

This article presents a survey of current research in hardware and software systems for reconfigurable computing, as well as techniques that specifically target run-time reconfigurability. We lead off this discussion by examining the technology required for reconfigurable computing, followed by a more in-depth examination of the various hardware structures used in reconfigurable systems. Next, we look at the software required for compilation of algorithms to configurable computers, and the trade-offs between hand-mapping and automatic compilation. Finally, we discuss run-time reconfigurable systems, which further utilize the intrinsic flexibility of configurable computing platforms by optimizing the hardware not only for different applications, but for different operations within a single application as well.

This survey does not seek to cover every technique and research project in the area of reconfigurable computing. Instead, it hopes to serve as an introduction to this rapidly evolving field, bringing interested readers quickly up to speed on developments from the last half-decade. Those interested in further background can find coverage of older techniques and systems elsewhere [Rose et al. 1993; Hauck and Agarwal 1996; Vuillemin et al. 1996; Mangione-Smith et al. 1997; Hauck 1998b].

2. TECHNOLOGY

Reconfigurable computing as a concept has been in existence for quite some time [Estrin et al. 1963]. Even general-purpose processors use some of the same basic ideas, such as reusing computational components for independent computations, and using multiplexers to control the routing between these components. However, the term reconfigurable computing, as it is used in current research (and within this survey), refers to systems incorporating some form of hardware programmability: customizing how the hardware is used by means of a number of physical control points. These control points can then be changed periodically in order to execute different applications using the same hardware.

Fig. 1. A programming bit for SRAM-based FPGAs [Xilinx 1994] (left) and a programmable routing connection (right).

The recent advances in reconfigurable computing are for the most part derived from the technologies developed for FPGAs in the mid-1980s. FPGAs were originally created to serve as a hybrid device between PALs and Mask-Programmable Gate Arrays (MPGAs). Like PALs, FPGAs are fully electrically programmable, meaning that the physical design costs are amortized over multiple application circuit implementations, and the hardware can be customized nearly instantaneously. Like MPGAs, they can implement very complex computations on a single chip, with devices currently in production containing the equivalent of over a million gates. Because of these features, FPGAs had been primarily viewed as glue-logic replacement and rapid-prototyping vehicles. However, as we show throughout this article, the flexibility, capacity, and performance of these devices have opened up completely new avenues in high-performance computation, forming the basis of reconfigurable computing.

Most current FPGAs and reconfigurable devices are SRAM-programmable (Figure 1 left), meaning that SRAM bits are connected to the configuration points in the FPGA, and programming the SRAM bits configures the FPGA. Thus, these chips can be programmed and reprogrammed about as easily as a standard static RAM. In fact, one research project, the PAM project [Vuillemin et al. 1996], considers a group of one or more FPGAs to be a RAM unit that performs computation between the memory write (sending the configuration information and input data) and memory read (reading the results of the computation). This leads some to use the term Programmable Active Memory or PAM.

Footnote: The term “SRAM” is technically incorrect for many FPGA architectures, given that the configuration memory may or may not support random access. In fact, the configuration memory tends to be continually read in order to perform its function. However, this is the generally accepted term in the field and correctly conveys the concept of static volatile memory using an easily understandable label.

One example of how the SRAM configuration points can be used is to control routing within a reconfigurable device [Chow et al. 1999a]. To configure the routing on an FPGA, typically a passgate structure is employed (see Figure 1 right). Here the programming bit will turn on a routing connection when it is configured with a true value, allowing a signal to flow from one wire to another, and will disconnect these resources when the bit is set to false. With a proper interconnection of these elements, which may include millions of routing choice points within a single device, a rich routing fabric can be created.
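The behavior of a single programmable routing connection can be sketched in a few lines. This is a simplified behavioral model, not a circuit description: the names `PassGate` and `route` are hypothetical, and an undriven wire is represented by `None` as a modeling assumption.

```python
class PassGate:
    """A routing switch controlled by one configuration (programming) bit."""

    def __init__(self, enabled=False):
        self.enabled = enabled  # the SRAM programming bit

    def propagate(self, source_value):
        # True bit: the two wires are connected and the signal passes through.
        # False bit: the wires are disconnected; None models an undriven wire.
        return source_value if self.enabled else None


def route(sources, config_bits):
    """Drive one destination wire from several candidate source wires.

    sources: signal values on the candidate source wires
    config_bits: one programming bit per candidate; at most one may be set
    """
    if sum(config_bits) > 1:
        raise ValueError("more than one driver enabled on the same wire")
    gates = [PassGate(bit) for bit in config_bits]
    driven = [g.propagate(v) for g, v in zip(gates, sources) if g.enabled]
    return driven[0] if driven else None


# Only the second programming bit is set, so the second wire drives the output.
print(route([0, 1, 0], [False, True, False]))
```

A routing fabric is then a large collection of such choice points, and a configuration is simply the assignment of a value to every programming bit.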

Another example of how these configuration bits may be used is to control multiplexers, which will choose between the output of different logic resources within the array. For example, to provide optional stateholding elements a D flip-flop (DFF) may be included with a multiplexer selecting whether to forward the latched or unlatched signal value (see Figure 2 left). Thus, for systems that require stateholding the programming bits controlling the multiplexer would be configured to select the DFF output, while systems that do not need this function would choose the bypass route that sends the input directly to the output. Similar structures can choose between other on-chip functionalities, such as fixed-logic computation elements, memories, carry chains, or other functions.

Fig. 2. D flip-flop with optional bypass (left) and a 3-input LUT (right).
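The flip-flop with optional bypass can be modeled as a register followed by a two-way multiplexer whose select line is a programming bit. The following is a cycle-level sketch under simplifying assumptions: the class name `LogicBlockOutput` and the `step` method are hypothetical, the flip-flop is assumed to reset to 0, and one call to `step` stands for one clock cycle.

```python
class LogicBlockOutput:
    """D flip-flop followed by a 2:1 multiplexer selected by a programming bit."""

    def __init__(self, use_dff):
        self.use_dff = use_dff  # programming bit: True selects the latched value
        self.state = 0          # value currently held in the flip-flop

    def step(self, d):
        """Advance one clock cycle with input d and return the block output."""
        latched = self.state    # DFF output: value captured on the previous edge
        self.state = d          # DFF captures the new input on this edge
        return latched if self.use_dff else d


inputs = [1, 0, 1, 1]
registered = LogicBlockOutput(use_dff=True)
bypassed = LogicBlockOutput(use_dff=False)
print([registered.step(d) for d in inputs])  # input delayed by one cycle
print([bypassed.step(d) for d in inputs])    # input passed straight through
```

The same hardware serves both sequential and purely combinational circuits; only the value of the programming bit differs.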

Finally, the configuration bits may be used as control signals for a computational unit or as the basis for computation itself. As a control signal, a configuration bit may determine whether an ALU performs an addition, subtraction, or other logic computations. On the other hand, with a structure such as a lookup table (LUT), the configuration bits themselves form the result of the computation (see Figure 2 right). These elements are essentially small memories provided for computing arbitrary logic functions. LUTs can compute any function of N inputs (where N is the number of control signals for the LUT's multiplexer) by programming the 2^N programming bits with the truth table of the desired function. Thus, if all programming bits except the one corresponding to the input pattern 111 were set to zero, a 3-input LUT would act as a 3-input AND gate, while programming it with all ones except in 000 would compute a 3-input OR.
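A lookup table is easy to model in software, which makes the role of the programming bits concrete. The sketch below is a behavioral illustration only: the names `LUT` and `program_from_function` are hypothetical, and the bit ordering (first input as the most significant address bit) is an assumption made for the example.

```python
class LUT:
    """N-input lookup table: 2**N programming bits addressed by the inputs."""

    def __init__(self, n_inputs, config_bits):
        if len(config_bits) != 2 ** n_inputs:
            raise ValueError("an N-input LUT needs exactly 2**N programming bits")
        self.n_inputs = n_inputs
        self.config_bits = list(config_bits)

    def evaluate(self, *inputs):
        if len(inputs) != self.n_inputs:
            raise ValueError("wrong number of inputs")
        # The inputs act as the multiplexer select lines: read them as a binary
        # address, with the first input as the most significant bit.
        address = 0
        for bit in inputs:
            address = (address << 1) | (1 if bit else 0)
        return self.config_bits[address]


def program_from_function(n_inputs, func):
    """Build the truth table of func, one programming bit per input pattern."""
    bits = []
    for address in range(2 ** n_inputs):
        pattern = [(address >> shift) & 1 for shift in reversed(range(n_inputs))]
        bits.append(1 if func(*pattern) else 0)
    return bits


and3 = LUT(3, [0, 0, 0, 0, 0, 0, 0, 1])  # only pattern 111 is set: 3-input AND
or3 = LUT(3, [0, 1, 1, 1, 1, 1, 1, 1])   # every pattern except 000 is set: 3-input OR
print(and3.evaluate(1, 1, 1), and3.evaluate(1, 0, 1))
print(or3.evaluate(0, 0, 0), or3.evaluate(0, 1, 0))
```

The helper `program_from_function` shows the general case: any Boolean function of N inputs is reduced to its 2^N-entry truth table, which is then written into the programming bits.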

3. HARDWARE

Reconfigurable computing systems use FPGAs or other programmable hardware to accelerate algorithm execution by mapping compute-intensive calculations to the reconfigurable substrate. These hardware resources are frequently coupled with a general-purpose microprocessor that is responsible for controlling the reconfigurable logic and executing program code that cannot be efficiently accelerated. In very closely coupled systems, the reconfigurability lies within customizable functional units on the regular datapath of the microprocessor. On the other hand, a reconfigurable computing system can be as loosely coupled as a networked stand-alone unit. Most reconfigurable systems are categorized somewhere between these two extremes, frequently with the reconfigurable hardware acting as a coprocessor to a host microprocessor. The programmable array itself can be comprised of one or more commercially available FPGAs, or can be a custom device designed specifically for reconfigurable computing.

The design of the actual computation blocks within the reconfigurable hardware varies from system to system. Each unit of computation, or logic block, can be as simple as a 3-input lookup table (LUT), or as complex as a 4-bit ALU. This difference in block size is commonly referred to as the granularity of the logic block, where the 3-input LUT is an example of a very fine-grained computational element, and a 4-bit ALU is an example of a quite coarse-grained unit. The finer-grained blocks are useful for bit-level manipulations, while the coarse-grained blocks are better optimized for standard datapath applications. Some architectures employ different sizes or types of blocks within a single reconfigurable array in order to efficiently support different types of computation. For example, memory is frequently embedded within the reconfigurable hardware to provide temporary data storage, forming a heterogeneous structure composed of both logic blocks and memory blocks [Ebeling et al. 1996; Altera 1998; Lucent 1998; Marshall et al. 1999; Xilinx 1999].


