
Showing papers on "Reconfigurable computing" published in 1999


Book
31 Mar 1999
TL;DR: From the Publisher: Architecture and CAD for Deep-Submicron FPGAs addresses several key issues in the design of high-performance FPGA architectures and CAD tools, with particular emphasis on issues that are important for FPGAs implemented in deep-submicron processes.
Abstract: From the Publisher: Architecture and CAD for Deep-Submicron FPGAs addresses several key issues in the design of high-performance FPGA architectures and CAD tools, with particular emphasis on issues that are important for FPGAs implemented in deep-submicron processes. Three factors combine to determine the performance of an FPGA: the quality of the CAD tools used to map circuits into the FPGA, the quality of the FPGA architecture, and the electrical (i.e. transistor-level) design of the FPGA. Architecture and CAD for Deep-Submicron FPGAs examines all three of these issues in concert.

1,335 citations


Journal ArticleDOI
01 Feb 1999
TL;DR: The current status of EHW is reviewed, the promises and possible advantages of EHW are discussed, and the challenges that must be met in order to develop practical and large-scale EHW are indicated.
Abstract: Evolvable hardware (EHW) has attracted increasing attention since the early 1990s with the advent of easily reconfigurable hardware, such as field programmable gate arrays (FPGAs). It promises to provide an entirely new approach to complex electronic circuit design and new adaptive hardware. EHW has been demonstrated to be able to perform a wide range of tasks, from pattern recognition to adaptive control. However, there are still many fundamental issues in EHW that remain open. This paper reviews the current status of EHW, discusses the promises and possible advantages of EHW, and indicates the challenges we must meet in order to develop practical and large-scale EHW.

342 citations


Proceedings ArticleDOI
01 Jun 1999
TL;DR: Reconfigurable computing is emerging as an important new organizational structure for implementing computations that combines the post-fabrication programmability of processors with the spatial computational style most commonly employed in hardware designs, changing traditional "hardware" and "software" boundaries.
Abstract: Reconfigurable computing is emerging as an important new organizational structure for implementing computations. It combines the post-fabrication programmability of processors with the spatial computational style most commonly employed in hardware designs. The result changes traditional "hardware" and "software" boundaries, providing an opportunity for greater computational capacity and density within a programmable medium. Reconfigurable computing must leverage traditional CAD technology for building spatial designs. Beyond that, however, reprogrammability introduces new challenges and opportunities for automation, including binding-time and specialization optimizations, regularity extraction and exploitation, and temporal partitioning and scheduling.

271 citations


01 Jan 1999
TL;DR: The JBits™ software is a set of Java™ classes that provide an Application Programming Interface (API) to the Xilinx FPGA bitstream, permitting all configurable resources in the FPGA, such as look-up tables, routing, and flip-flops, to be individually configured under software control.
Abstract: The JBits™ software is a set of Java™ classes which provide an Application Programming Interface (API) to access the Xilinx FPGA bitstream. The interface operates on either bitstreams generated by Xilinx design tools, or on bitstreams read back from actual hardware. This permits all configurable resources like look-up tables, routing and the flip-flops in the FPGA to be individually configured under software control.

269 citations


Journal ArticleDOI
TL;DR: Key functionalities of the digital front-end are described and how the signal characteristics of mobile communications signals and commonalities among different signal processing operations can be exploited to great advantage, eventually enabling implementations on an ASIC that, although not reconfigurable, would empower the software radio concept.
Abstract: When expanding digital signal processing of mobile communications terminals toward the antenna while making the terminal more wideband in order to be able to cope with different mobile communications standards in a software radio based terminal, the designer is faced with strong requirements such as bandwidth and dynamic range. Many publications claim that only reconfigurable hardware such as FPGAs can simultaneously cope with such diversity and requirements. Starting with considerations of the receiver architecture, we describe key functionalities of the digital front-end and highlight how the signal characteristics of mobile communications signals and commonalities among different signal processing operations can be exploited to great advantage, eventually enabling implementations on an ASIC that, although not reconfigurable, would empower the software radio concept.

227 citations


Journal ArticleDOI
TL;DR: Three hypotheses are formulated: in the "design space" of possible electronic circuits, conventional design methods work within constrained regions, never considering most of the whole, but evolutionary algorithms can explore some of the regions beyond the scope of conventional methods, raising the possibility that better designs can be found.
Abstract: Three hypotheses are formulated. First, in the "design space" of possible electronic circuits, conventional design methods work within constrained regions, never considering most of the whole. Second, evolutionary algorithms can explore some of the regions beyond the scope of conventional methods, raising the possibility that better designs can be found. Third, evolutionary algorithms can in practice produce designs that are beyond the scope of conventional methods, and that are in some sense better. A reconfigurable hardware controller for a robot is evolved, using a conventional architecture with and without orthodox design constraints. In the unconstrained case, evolution exploited the enhanced capabilities of the hardware. A tone discriminator circuit is evolved on an FPGA without constraints, resulting in a structure and dynamics that are foreign to conventional design and analysis. The first two hypotheses are true. Evolution can explore the forms and processes that are natural to the electronic medium, and nonbehavioral requirements can be integrated into this design process, such as fault tolerance. A strategy to evolve circuit robustness tailored to the task, the circuit, and the medium, is presented. Hardware and software tools enabling research progress are discussed. The third hypothesis is a good working one: practically useful but radically unconventional evolved circuits are in sight.

225 citations


Proceedings ArticleDOI
01 Feb 1999
TL;DR: A reconfigurable architecture optimised for media processing, and based on 4-bit ALUs and interconnect, is described.
Abstract: In this paper we describe a reconfigurable architecture optimised for media processing, and based on 4-bit ALUs and interconnect.

224 citations


Proceedings ArticleDOI
01 Feb 1999
TL;DR: A novel reconfigurable computing array, the High-Speed, Hierarchical Synchronous Reconfigurable Array (HSRA), and its supporting tools are introduced, which demonstrates that computing arrays can achieve efficient, high-speed operation.
Abstract: There is no inherent characteristic forcing Field Programmable Gate Array (FPGA) or Reconfigurable Computing (RC) Array cycle times to be greater than processors in the same process. Modern FPGAs seldom achieve application clock rates close to their processor cousins because (1) resources in the FPGAs are not balanced appropriately for high-speed operation, (2) FPGA CAD does not automatically provide the requisite transforms to support this operation, and (3) interconnect delays can be large and vary almost continuously, complicating high frequency mapping. We introduce a novel reconfigurable computing array, the High-Speed, Hierarchical Synchronous Reconfigurable Array (HSRA), and its supporting tools. This package demonstrates that computing arrays can achieve efficient, high-speed operation. We have designed and implemented a prototype component in a 0.4 µm logic design on a DRAM process which will support 250 MHz operation for CAD mapped designs.

195 citations


Proceedings ArticleDOI
14 Apr 1999
TL;DR: This contribution proposes arithmetic architectures which are optimized for modern field programmable gate arrays (FPGAs) and shows that it is possible to implement modular exponentiation at secure bit lengths on a single commercially available FPGA.
Abstract: It is widely recognized that security issues will play a crucial role in the majority of future computer and communication systems. Central tools for achieving system security are cryptographic algorithms. For performance as well as for physical security reasons, it is often advantageous to realize cryptographic algorithms in hardware. In order to overcome the well-known drawback of reduced flexibility that is associated with traditional ASIC solutions, this contribution proposes arithmetic architectures which are optimized for modern field programmable gate arrays (FPGAs). The proposed architectures perform modular exponentiation with very long integers. This operation is at the heart of many practical public-key algorithms such as RSA and discrete logarithm schemes. We combine the Montgomery modular multiplication algorithm with a new systolic array design, which is capable of processing a variable number of bits per array cell. The designs are flexible, allowing any choice of operand and modulus. Unlike previous approaches, we systematically implement and compare several variants of our new architecture for different bit lengths. We provide absolute area and timing measures for each architecture. The results allow conclusions about the feasibility and time-space trade-offs of our architecture for implementation on Xilinx XC4000 series FPGAs. As a major practical result we show that it is possible to implement modular exponentiation at secure bit lengths on a single commercially available FPGA.

192 citations
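
As a rough illustration of the arithmetic named in the abstract above, the following is a minimal software sketch of textbook radix-2 Montgomery multiplication driving a square-and-multiply exponentiation. It is not the paper's systolic array, and it does not model processing a variable number of bits per array cell; the function names `montgomery_multiply` and `montgomery_modexp` are chosen here for illustration only. The modulus is assumed to be odd.

```python
def montgomery_multiply(a, b, n, k):
    """Return a * b * 2^(-k) mod n for odd n < 2^k and 0 <= a, b < n.

    One bit of `a` is consumed per iteration, which is the step a
    bit-serial hardware cell would perform (illustrative model only).
    """
    s = 0
    for i in range(k):
        a_i = (a >> i) & 1
        s += a_i * b
        if s & 1:        # adding the odd modulus makes s even
            s += n
        s >>= 1          # exact division by 2
    if s >= n:           # single final correction, since s < 2n here
        s -= n
    return s


def montgomery_modexp(base, exponent, n):
    """Compute base^exponent mod n for odd n > 1 using Montgomery products."""
    if n % 2 == 0 or n <= 1:
        raise ValueError("modulus must be odd and greater than 1")
    k = n.bit_length()
    r2 = pow(2, 2 * k, n)                          # R^2 mod n, with R = 2^k
    x = montgomery_multiply(base % n, r2, n, k)    # base in Montgomery form
    acc = montgomery_multiply(1, r2, n, k)         # 1 in Montgomery form
    for bit in bin(exponent)[2:]:                  # left-to-right square-and-multiply
        acc = montgomery_multiply(acc, acc, n, k)
        if bit == "1":
            acc = montgomery_multiply(acc, x, n, k)
    return montgomery_multiply(acc, 1, n, k)       # leave Montgomery form
```

The loop body needs only additions, a parity test, and a one-bit shift, with no trial division by the modulus. That general property is what makes the algorithm a common choice for hardware implementations.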


Proceedings ArticleDOI
01 Feb 1999
TL;DR: Initial evidence from a hierarchical array design is presented showing that high LUT utilization is not directly correlated with efficient silicon usage, and an algorithm for "depopulating" the gates in a hierarchical network to match the limited wiring resources is introduced.
Abstract: FPGA users often view the ability of an FPGA to route designs with high LUT (gate) utilization as a feature, leading them to demand high gate utilization from vendors. We present initial evidence from a hierarchical array design showing that high LUT utilization is not directly correlated with efficient silicon usage. Rather, since interconnect resources consume most of the area on these devices (often 80-90%), we can achieve more area efficient designs by allowing some LUTs to go unused, which lets us use the dominant resource, interconnect, more efficiently. This extends the "Sea-of-gates" philosophy, familiar to mask programmable gate arrays, to FPGAs. Also introduced in this work is an algorithm for "depopulating" the gates in a hierarchical network to match the limited wiring resources.

186 citations


Book
15 Jan 1999
TL;DR: This is my doctoral thesis, written in late 1996, available in book form under Springer's distinguished dissertations series.
Abstract: This is my doctoral thesis, written in late 1996, available in book form under Springer's distinguished dissertations series. It can be ordered direct from Springer, or from any bookseller including those on-line.

Proceedings ArticleDOI
01 Feb 1999
TL;DR: The architecture of OneChip-98 is described: a custom computing machine that overcomes the interconnection bottleneck by closely integrating a fixed-logic processor, a reconfigurable logic array, and memory into a single chip.
Abstract: As custom computing machines evolve, it is clear that a major bottleneck is the slow interconnection architecture between the logic and memory. This paper describes the architecture of a custom computing machine that overcomes the interconnection bottleneck by closely integrating a fixed-logic processor, a reconfigurable logic array, and memory into a single chip, called OneChip-98. The OneChip-98 system has a seamless programming model that enables the programmer to easily specify instructions without additional complex instruction decoding hardware. As well, there is a simple scheme for mapping instructions to the corresponding programming bits. To allow the processor and the reconfigurable array to execute concurrently, the programming model utilizes a novel memory-consistency scheme implemented in the hardware. To evaluate the feasibility of the OneChip-98 architecture, a 32-bit MIPS-like processor and several performance enhancement applications were mapped to the Transmogrifier-2 field programmable system. For two typical applications, the 2-dimensional discrete cosine transform and the 64-tap FIR filter, we were capable of achieving a performance speedup of over 30 times that of a stand-alone state-of-the-art processor.

Proceedings ArticleDOI
21 Apr 1999
TL;DR: A smart compilation chain in which the compiler is no longer limited by a pre-defined instruction set, but can generate application-specific custom instructions and synthesise them in Field-Programmable Logic to reduce the reconfiguration overhead and optimise the utilisation of resources is proposed.
Abstract: We propose a smart compilation chain in which the compiler is no longer limited by a pre-defined instruction set, but can generate application-specific custom instructions and synthesise them in Field-Programmable Logic. We also present a RISC micro-architecture enhanced by a CPLD-based Reconfigurable Functional Unit (RFU) which supports our compiler approach. The main difference between our smart compiler and similar methods is the ability to encode multiple custom instructions in a single RFU configuration, cross-minimising the logic among them. The objective is to reduce (or eliminate) the reconfiguration overhead and optimise the utilisation of resources. The CPLD core that implements the RFU is based on the Philips XPLA2 architecture. We discuss the advantages of using the XPLA2 instead of conventional FPGAs. Application examples are also presented, which show that our RFU-extended CPU can achieve speed-ups of more than 40% for encryption algorithms, when compared to the standard CPU core alone.

Proceedings ArticleDOI
21 Apr 1999
TL;DR: Pipeline vectorization, a method for synthesizing hardware pipelines in reconfigurable systems based on software vectorizing compilers, is presented, along with several loop transformations that customize pipelines to meet hardware resource constraints while maximising available parallelism.
Abstract: This paper presents pipeline vectorization, a method for synthesizing hardware pipelines in reconfigurable systems based on software vectorizing compilers. The method improves efficiency and ease of development of reconfigurable designs, particularly for users with little electronics design experience. We propose several loop transformations to customize pipelines to meet hardware resource constraints, while maximising available parallelism. For run-time reconfigurable systems, we apply hardware specialization to increase circuit utilization. Our approach is especially effective for highly repetitive computations in DSP and multimedia applications. Case studies using FPGA-based platforms are presented to demonstrate the benefits of our approach and to evaluate trade-offs between alternative implementations. The loop tiling transformation, for instance, has been found to improve performance by 30 to 40 times above a PC-based software implementation, depending on whether run-time reconfiguration is used.
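
The loop tiling transformation mentioned in the abstract above can be sketched generically in software. The example below is an assumption-laden illustration and not the paper's compiler: it tiles the tap loop of a FIR filter so that each pass touches at most `tile` taps, where `tile` stands in for a hypothetical limit on available multipliers. The names `fir` and `fir_tiled` are invented for this sketch.

```python
def fir(samples, taps):
    """Reference FIR filter: one output per fully overlapping window."""
    n_out = len(samples) - len(taps) + 1
    return [
        sum(taps[k] * samples[i + k] for k in range(len(taps)))
        for i in range(n_out)
    ]


def fir_tiled(samples, taps, tile):
    """Same result as fir(), with the tap loop split into tiles.

    `tile` models a hypothetical hardware resource limit: each pass over
    the data uses at most `tile` taps, and partial sums accumulate in `out`.
    """
    if tile < 1:
        raise ValueError("tile must be at least 1")
    n_out = len(samples) - len(taps) + 1
    out = [0] * max(n_out, 0)
    for k0 in range(0, len(taps), tile):            # outer loop over tiles
        k1 = min(k0 + tile, len(taps))
        for i in range(n_out):                      # stream the data once per tile
            for k in range(k0, k1):                 # inner loop fits the resource limit
                out[i] += taps[k] * samples[i + k]
    return out
```

With integer inputs the tiled and untiled versions agree exactly. With floating-point inputs the different summation order can change the last bits of the result.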

Proceedings ArticleDOI
12 Oct 1999
TL;DR: The Cameron Project is presented, which aims to provide a high level, algorithmic language and optimizing compiler for the development of image processing applications on reconfigurable computing systems (RCSs) using SA-C, a single assignment variant of the C programming language.
Abstract: This paper presents the Cameron Project, which aims to provide a high level, algorithmic language and optimizing compiler for the development of image processing applications on reconfigurable computing systems (RCSs). SA-C, a single assignment variant of the C programming language, is designed to exploit both coarse-grain and fine-grain parallelism in image processing applications. Khoros, a software development environment commonly used for image processing, has been modified to support SA-C program development. SA-C supports image processing with true multidimensional arrays, and with sophisticated array access and windowing mechanisms. Reduction operators such as medians and histograms are also provided. The optimizing compiler targets RCSs, which are fine-grained parallel processors made up of field programmable gate arrays (FPGAs), memories and interconnection hardware. They can be used as inexpensive co-processors with conventional workstations or PCs. This paper discusses compiler optimizations to generate optimal FPGA code using dataflow analysis techniques applied to data dependence graphs. Initial results are presented.

Journal ArticleDOI
01 Dec 1999
TL;DR: This paper is a review of the current state-of-the-art in the applications of the dataflow model of computation, focusing on three areas: multithreaded computing, signal processing and reconfigurable computing.
Abstract: The dataflow program graph execution model, or dataflow for short, is an alternative to the stored-program (von Neumann) execution model. Because it relies on a graph representation of programs, the strengths of the dataflow model are very much the complements of those of the stored-program one. In the last thirty or so years since it was proposed, the dataflow model of computation has been used and developed in very many areas of computing research: from programming languages to processor design, and from signal processing to reconfigurable computing. This paper is a review of the current state-of-the-art in the applications of the dataflow model of computation. It focuses on three areas: multithreaded computing, signal processing and reconfigurable computing. © 1999 Elsevier Science B.V. All rights reserved.
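
To make the dataflow execution model described above concrete, here is a minimal sketch of a data-driven interpreter, in which a node fires as soon as all of its operands carry values rather than when a program counter reaches it. The graph encoding and the name `run_dataflow` are assumptions made for this illustration and are not taken from the reviewed paper.

```python
from operator import add, mul


def run_dataflow(graph, inputs):
    """Evaluate a dataflow graph.

    graph:  maps node name -> (function, list of operand names)
    inputs: maps input name -> value
    A node fires once every operand has a value; nodes that become ready
    in the same wave are independent and could fire in parallel.
    """
    values = dict(inputs)
    pending = dict(graph)
    while pending:
        ready = [
            name for name, (_, operands) in pending.items()
            if all(op in values for op in operands)
        ]
        if not ready:
            raise ValueError("graph has a cycle or a missing input")
        for name in ready:
            function, operands = pending.pop(name)
            values[name] = function(*(values[op] for op in operands))
    return values


# (a + b) * c expressed as a two-node graph
example_graph = {
    "sum": (add, ["a", "b"]),
    "product": (mul, ["sum", "c"]),
}
result = run_dataflow(example_graph, {"a": 2, "b": 3, "c": 4})
```

Execution order here follows data availability alone, which is the contrast with the stored-program model that the abstract draws.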

Patent
15 Jan 1999
TL;DR: In this article, a reconfigurable computing system and method of use are provided for interfacing a plurality of application programs running on a host system to one or more hardware objects defined in configuration files.
Abstract: A reconfigurable computing system and method of use are provided for interfacing a plurality of application programs running on a host system to one or more hardware objects defined in one or more configuration files. The system includes reconfigurable computing circuitry comprising flexibly configurable circuitry operable for interfacing and implementing one or more hardware objects with one or more of the application programs. The system further includes memory circuitry associated with the reconfigurable computing circuitry for system information storage, and communications interfaces for connecting the reconfigurable computing circuitry and the memory to the host computer. The flexibly configurable circuitry further comprises one or more FPGAs and one or more programmable logic devices (“PLDs”), SRAM and EEPROM memory, and all the necessary connectors and support circuitry. The reconfigurable computing system and method of the present invention can be implemented on either a PCMCIA platform, a PCI platform, or any other bus structure without changing the basic functionality and claimed functionality of the reconfigurable computing system. Additionally, the reconfigurable computing system and method of this invention are well suited to be implemented in a portable computing environment.

Proceedings ArticleDOI
21 Apr 1999
TL;DR: This paper presents a new approach to synthesize to reconfigurable hardware (HW) user-specified regions of a program, under the assumption of "virtual HW" support, which exploits the temporal partitions at the behavior level, resolves memory access conflicts, and generates the VHDL descriptions at register-transfer level that will be mapped into the reconfigurable HW devices.
Abstract: This paper presents a new approach to synthesize to reconfigurable hardware (HW) user-specified regions of a program, under the assumption of "virtual HW" support. The automation of this approach is supported by a compiler front-end and by an HW compiler under development. The front-end starts from the Java bytecodes and, therefore, supports any language that can be compiled to the JVM (Java Virtual Machine) model. It extracts from the bytecodes all the dependencies inside and between basic blocks. This information is stored in representation graphs more suitable to efficiently exploit the existent parallelism in the program than those typically used in high-level synthesis. From the intermediate representations the HW compiler exploits the temporal partitions at the behavior level, resolves memory access conflicts, and generates the VHDL descriptions at register-transfer level that will be mapped into the reconfigurable HW devices.

Journal ArticleDOI
TL;DR: A new gate-level model that handles time-multiplexed computation is proposed and an enhanced force directed scheduling (FDS) algorithm is introduced to partition sequential circuits that finds a correct partition with low logic and communication costs, under the assumption that maximum performance is desired.
Abstract: A fundamental feature of Dynamically Reconfigurable FPGAs (DRFPGAs) is that the logic and interconnect are time-multiplexed. Thus, for a circuit to be implemented on a DRFPGA, it needs to be partitioned such that each subcircuit can be executed at a different time. In this paper, the partitioning of sequential circuits for execution on a DRFPGA is studied. To determine how to correctly partition a sequential circuit and what are the costs in doing so, we propose a new gate-level model that handles time-multiplexed computation. We also introduce an enhanced force directed scheduling (FDS) algorithm to partition sequential circuits that finds a correct partition with low logic and communication costs, under the assumption that maximum performance is desired. We use our algorithm to partition seven large ISCAS'89 sequential benchmark circuits. The experimental results show that the enhanced FDS reduces communication costs by 27.5 percent with only a 1.1 percent increase in the gate cost compared to traditional FDS.

Journal ArticleDOI
Jack Jean, Karen A. Tomko, V. Yavagal, J. Shah, R. Cook
TL;DR: The development of a dynamically reconfigurable system that can support multiple applications running concurrently and the impact of supporting concurrency and preloading in reducing application execution time is demonstrated.
Abstract: This paper describes the development of a dynamically reconfigurable system that can support multiple applications running concurrently. A dynamically reconfigurable system allows hardware reconfiguration while part of the reconfigurable hardware is busy computing. An FPGA resource manager (RM) is developed to allocate and de-allocate FPGA resources and to preload FPGA configuration files. For each individual application, different tasks that require FPGA resources are represented as a flow graph which is made available to the RM so as to enable efficient resource management and preloading. The performance of using the RM to support several applications is summarized. The impact of supporting concurrency and preloading in reducing application execution time is demonstrated.

Proceedings ArticleDOI
15 Feb 1999
TL;DR: The dynamically reconfigurable logic engine (DRLE) prototype described meets the challenge of achieving both hardware efficiency and software programmability by dynamically reconfiguring FPGAs.
Abstract: Reconfigurable logic LSIs, such as FPGAs, have been perceived as devices for prototyping and emulation. As the size and speed of FPGAs rapidly increase, however, they have begun to be used in µP-based systems as reconfigurable accelerators. The idea is to achieve both hardware efficiency and software programmability by dynamically reconfiguring FPGAs. This idea, reconfigurable computing, provides an attractive solution especially for media/network-centric applications. Various types of reconfiguration scenarios in such applications, however, require logic LSIs to significantly enhance reconfigurability in three respects: (1) agility: reconfiguration may need to take place in very short intervals, say within a hundred µP instructions; (2) controllability: reconfiguration may be controlled from an external µP or by itself; (3) flexibility: the reconfiguration target may be arbitrarily positioned and irregularly shaped. The dynamically reconfigurable logic engine (DRLE) prototype described meets this challenge.

Proceedings ArticleDOI
21 Apr 1999
TL;DR: New compression algorithms for FPGA configurations that can significantly reduce this overhead are developed, which results in a single compression methodology which achieves higher compression ratios than existing algorithms in an off-line version, as well as a somewhat lower quality compression approach for on-line use in dynamic circuit generation and other mapping-time critical situations.
Abstract: The time it takes to reconfigure FPGAs can be a significant overhead for reconfigurable computing. In this paper we develop new compression algorithms for FPGA configurations that can significantly reduce this overhead. By using run-length and other compression techniques, files can be compressed by a factor of 3.6. Bus transfer and decompression hardware are also discussed. This results in a single compression methodology which achieves higher compression ratios than existing algorithms in an off-line version, as well as a somewhat lower quality compression approach which is suitable for on-line use in dynamic circuit generation and other mapping-time critical situations.
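
Run-length coding, one of the techniques the abstract above names, can be sketched in a few lines. This is a generic byte-level run-length codec offered as an illustration, assuming a configuration stream with long runs of identical bytes. It is not the paper's algorithm, and the 255-byte run limit and the names `rle_encode` and `rle_decode` are choices made for this sketch.

```python
def rle_encode(data):
    """Encode bytes as a list of (run_length, byte_value) pairs.

    Runs are capped at 255 so each length would fit in one byte
    (an arbitrary choice for this sketch).
    """
    runs = []
    for value in data:
        if runs and runs[-1][1] == value and runs[-1][0] < 255:
            runs[-1] = (runs[-1][0] + 1, value)
        else:
            runs.append((1, value))
    return runs


def rle_decode(runs):
    """Invert rle_encode: expand each (run_length, byte_value) pair."""
    return b"".join(bytes([value]) * count for count, value in runs)


# A made-up stream with long zero runs, standing in for sparse configuration data
stream = bytes(40) + b"\x3c\x3c\x81" + bytes(300)
encoded = rle_encode(stream)
assert rle_decode(encoded) == stream
```

How much such a scheme saves depends entirely on how repetitive the input is. The factor of 3.6 reported in the abstract refers to the paper's own algorithms and data, not to this sketch.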

Journal ArticleDOI
TL;DR: The use of the custom computing approach to meet the computation and communication needs of computer vision algorithms is described; customizing the hardware architecture at the instruction level for every application allows the optimal grain size needed for the problem at hand and the instruction granularity to be matched.
Abstract: Computer vision algorithms are natural candidates for high performance computing systems. Algorithms in computer vision are characterized by complex and repetitive operations on large amounts of data involving a variety of data interactions (e.g., point operations, neighborhood operations, global operations). In this paper, we describe the use of the custom computing approach to meet the computation and communication needs of computer vision algorithms. By customizing hardware architecture at the instruction level for every application, the optimal grain size needed for the problem at hand and the instruction granularity can be matched. A custom computing approach can also reuse the same hardware by reconfiguring at the software level for different levels of the computer vision application. We demonstrate the advantages of our approach using Splash 2, a Xilinx 4010-based custom computer.

Proceedings ArticleDOI
01 Feb 1999
TL;DR: This paper describes the hardware implementation of the Generalized Profile Search algorithm using online arithmetic and redundant data representation, which leads to a significant increase of data throughput in comparison with standard non-redundant data coding.
Abstract: This paper describes the hardware implementation of the Generalized Profile Search algorithm using online arithmetic and redundant data representation. This is part of the GenStorm project, aimed at providing a dedicated computer for biological sequence processing based on reconfigurable hardware using FPGAs. The serial evaluation of the result made possible by a redundant data representation leads to a significant increase of data throughput in comparison with standard non-redundant data coding.

Journal ArticleDOI
TL;DR: An algorithm is developed, targeted to the decompression hardware imbedded in the Xilinx XC6200 series field-programmable gate array architecture, that can radically reduce the amount of data needed to transfer during reconfiguration.
Abstract: One of the major overheads in reconfigurable computing is the time it takes to reconfigure the devices in the system. This overhead limits the speedups possible in this exciting new paradigm. In this paper we explore one technique for reducing this overhead: the compression of configuration datastreams. We develop an algorithm, targeted to the decompression hardware imbedded in the Xilinx XC6200 series field-programmable gate array architecture, that can radically reduce the amount of data needed to transfer during reconfiguration. This results in an overall reduction of about a factor of four in total bandwidth required for reconfiguration.

Journal ArticleDOI
24 Oct 1999
TL;DR: A high resolution PET scanner requiring processing electronics for 936 block technology channels and just under sixty-thousand crystal elements has been developed and an FPGA implementation of the front-end processing electronics was chosen over the traditional discrete logic or Application Specific Integrated Circuit.
Abstract: A high resolution PET scanner requiring processing electronics for 936 block technology channels and just under sixty-thousand crystal elements has been developed. With the advances in flexibility, number of gates, lower costs and speed of Field Programmable Gate Arrays (FPGA), an FPGA implementation of the front-end processing electronics was chosen over the traditional discrete logic or Application Specific Integrated Circuit (ASIC). The FPGA architecture reduced the development time and risks compared to a mask-based ASIC architecture while keeping costs and electronics packing density comparable. The extensive use of FPGAs enables much faster circuit realization and a very efficient logic utilization by allowing re-configuration of the electronics functionality to support system setup, self-diagnostics, and several calibration modes for detector setup. Logic realized within the FPGAs performs the crystal selection, energy qualification, time correction, depth of interaction determination, and event counting functions. Since the FPGAs are in-circuit re-configurable (ICR), the functionality of the electronics is easily modified to support the different modes of operation. Thus the development time is reduced as well as the amount of electronics required, saving board area, power consumption and costs.

Proceedings ArticleDOI
07 Apr 1999
TL;DR: An approach to automated synthesis of CMOS circuits, based on evolution on a programmable transistor array (PTA) is introduced, illustrated with a software experiment showing evolutionary synthesis of a circuit with a desired DC characteristic.
Abstract: Evolvable hardware is reconfigurable hardware that self-configures under the control of an evolutionary algorithm. The search for a hardware configuration can be performed using software models or, faster and more accurately, directly in reconfigurable hardware. Several experiments have demonstrated the possibility to automatically synthesize both digital and analog circuits. The paper introduces an approach to automated synthesis of CMOS circuits, based on evolution on a programmable transistor array (PTA). The approach is illustrated with a software experiment showing evolutionary synthesis of a circuit with a desired DC characteristic. A hardware implementation of a test PTA chip is then described, and the same evolutionary experiment is performed on the chip demonstrating circuit synthesis/self-configuration directly in hardware.

Proceedings ArticleDOI
10 Jan 1999
TL;DR: A novel approach rapidly tests the interconnect in the FPGAs each time the system is reconfigured, using the "walking-1" approach and a low-cost configuration-dependent test method to both detect and locate faults in the interconnect.
Abstract: An FPGA-based reconfigurable system may contain boards of FPGAs which are reconfigured for different applications and must work correctly. This paper presents a novel approach for rapidly testing the interconnect in the FPGAs each time the system is reconfigured. A low-cost configuration-dependent test method is used to both detect and locate faults in the interconnect. The "original configuration" is modified by only changing the logic function of the CLBs to form "test configurations" that can be used to quickly test the interconnect using the "walking-1" approach. The test procedure is rapid enough to be performed on the fly whenever the system is reconfigured. All stuck-at faults and bridging faults in the interconnect are guaranteed to be detected and located with a short test length. The fault location information can be used to reconfigure the system to avoid the faulty hardware.
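
The "walking-1" idea mentioned in this abstract can be illustrated with a small software model. The sketch below is not the paper's procedure: it assumes a bundle of n independent wires, a hypothetical fault model in which a stuck-at fault forces a wire to a fixed value and a bridging fault behaves as a wired-OR between two wires, and all function names are made up for illustration. Each test vector drives a single 1 onto one wire and 0 onto the rest, so a wire that disagrees with the applied pattern is flagged as faulty.

```python
def walking_one_patterns(n):
    """Return n test vectors; vector k has a single 1 on wire k."""
    return [[1 if i == k else 0 for i in range(n)] for k in range(n)]


def apply_faults(vector, stuck_at=None, bridges=()):
    """Hypothetical fault model (an assumption, not taken from the paper).

    stuck_at: dict mapping wire index -> forced value (0 or 1).
    bridges:  pairs of wire indices shorted together, modelled as wired-OR.
    """
    out = list(vector)
    for a, b in bridges:
        shorted = out[a] | out[b]
        out[a] = out[b] = shorted
    for wire, value in (stuck_at or {}).items():
        out[wire] = value
    return out


def diagnose(n, stuck_at=None, bridges=()):
    """Return the set of wires whose response ever differs from the stimulus."""
    suspects = set()
    for vec in walking_one_patterns(n):
        observed = apply_faults(vec, stuck_at, bridges)
        for i, (expected, seen) in enumerate(zip(vec, observed)):
            if expected != seen:
                suspects.add(i)
    return suspects
```

In this model a stuck-at-0 wire is exposed by the one vector that drives it to 1, a stuck-at-1 wire by any vector that drives it to 0, and a bridge by the two vectors that put the 1 on either of the shorted wires. The test length is n vectors for n wires.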

Proceedings ArticleDOI
01 Jan 1999
TL;DR: This paper shows how different execution orders for the kernels (sub-tasks) of a complex application may result in varying levels of performance, formulates an analytical approach, and presents a solution to this new problem.
Abstract: Reconfigurable computing is a flexible way of addressing a wide range of applications with a single device at a good level of performance. This area of computing involves different issues and concepts when compared with conventional computing systems. One of these concepts is context loading. The context refers to the coded configuration information to implement a particular circuit behaviour. An important problem for reconfigurable computing is the scheduling of a group of kernels (sub-tasks) that constitute a complex application for minimum execution time. In this paper, we show how the different execution orders for these sub-tasks may result in varying levels of performance. We formulate an analytical approach and present a solution for this new problem.
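
To make the effect of execution order concrete, here is a minimal sketch under a deliberately simple cost model that is an assumption and not the paper's formulation: each kernel has a context load time and a run time, the device can load the next context while the current kernel runs, and kernels have no data dependencies. The kernel names, the timing values, and both function names are hypothetical.

```python
from itertools import permutations


def makespan(order, load, run):
    """Total time for a kernel order under the assumed two-stage model.

    A kernel may start only after its context is loaded and the previous
    kernel has finished; contexts are loaded back to back in the same order.
    """
    load_done = 0
    run_done = 0
    for kernel in order:
        load_done += load[kernel]
        run_done = max(run_done, load_done) + run[kernel]
    return run_done


def best_order(load, run):
    """Exhaustively search all orders (only practical for a few kernels)."""
    return min(permutations(load), key=lambda o: makespan(o, load, run))


# Hypothetical timings in arbitrary units.
load = {"A": 4, "B": 1, "C": 3}
run = {"A": 1, "B": 5, "C": 2}
```

With these made-up numbers, running the kernels as A, B, C takes 12 time units, while B, C, A takes 9, because the long-running kernel B hides the loading of the later contexts. A realistic scheduler would also have to respect data dependencies and on-chip context memory limits, which this sketch ignores.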

Book ChapterDOI
12 Aug 1999
TL;DR: This paper describes how various proposed AES algorithms could be implemented on PipeRench, a reconfigurable fabric that supports implementations which can yield better than custom-hardware performance and yet maintains all the flexibility of software-based systems.
Abstract: Cryptographic algorithms are more efficiently implemented in custom hardware than in software running on general-purpose processors. However, systems which use hardware implementations have significant drawbacks: they are unable to respond to flaws discovered in the implemented algorithm or to changes in standards. In this paper we show how reconfigurable computing offers high-performance yet flexible solutions for cryptographic algorithms. We focus on PipeRench, a reconfigurable fabric that supports implementations which can yield better than custom-hardware performance and yet maintains all the flexibility of software-based systems. PipeRench is a pipelined reconfigurable fabric which virtualizes hardware, enabling large circuits to be run on limited physical hardware. We present implementations for Crypton, IDEA, RC6, and Twofish on PipeRench and an extension of PipeRench, PipeRench+. We also describe how various proposed AES algorithms could be implemented on PipeRench. PipeRench achieves speedups of between 2x and 12x over conventional processors.