
Showing papers on "Field-programmable gate array" published in 2001


Book
15 Sep 2001
TL;DR: This edition has a new chapter on microprocessors, new sections on special functions using MAC calls, intellectual property core design and arbitrary sampling rate converters, and over 100 new exercises.
Abstract: Field-Programmable Gate Arrays (FPGAs) are revolutionizing digital signal processing as novel FPGA families are replacing ASICs and PDSPs for front-end digital signal processing algorithms. So the efficient implementation of these algorithms is critical and is the main goal of this book. It starts with an overview of today's FPGA technology, devices, and tools for designing state-of-the-art DSP systems. A case study in the first chapter is the basis for more than 40 design examples throughout. The following chapters deal with computer arithmetic concepts, theory and the implementation of FIR and IIR filters, multirate digital signal processing systems, DFT and FFT algorithms, advanced algorithms with high future potential, and adaptive filters. Each chapter contains exercises. The Verilog source code and a glossary are given in the appendices, while the accompanying CD-ROM contains the examples in VHDL and Verilog code as well as the newest Altera "Quartus II web edition" software. This edition has a new chapter on microprocessors, new sections on special functions using MAC calls, intellectual property core design and arbitrary sampling rate converters, and over 100 new exercises.

615 citations


Journal ArticleDOI
01 May 2001
TL;DR: A survey of academic research and commercial development in reconfigurable computing for DSP systems over the past fifteen years is presented in this article, with a focus on the application domain of digital signal processing.
Abstract: Steady advances in VLSI technology and design tools have extensively expanded the application domain of digital signal processing over the past decade. While application-specific integrated circuits (ASICs) and programmable digital signal processors (PDSPs) remain the implementation mechanisms of choice for many DSP applications, increasingly new system implementations based on reconfigurable computing are being considered. These flexible platforms, which offer the functional efficiency of hardware and the programmability of software, are quickly maturing as the logic capacity of programmable devices follows Moore's Law and advanced automated design techniques become available. As initial reconfigurable technologies have emerged, new academic and commercial efforts have been initiated to support power optimization, cost reduction, and enhanced run-time performance. This paper presents a survey of academic research and commercial development in reconfigurable computing for DSP systems over the past fifteen years. This work is placed in the context of other available DSP implementation media including ASICs and PDSPs to fully document the range of design choices available to system engineers. It is shown that while contemporary reconfigurable computing can be applied to a variety of DSP applications including video, audio, speech, and control, much work remains to realize its full potential. While individual implementations of PDSP, ASIC, and reconfigurable resources each offer distinct advantages, it is likely that integrated combinations of these technologies will provide more complete solutions.

390 citations


Patent
19 Oct 2001
TL;DR: In this article, the authors present a real-time tool to select a set of software modules and hardware configuration files from a series of libraries, which are then downloaded to the reconfigurable hardware.
Abstract: A reconfigurable test system including a host computer coupled to a reconfigurable test instrument. The reconfigurable test instrument includes reconfigurable hardware—i.e. a reconfigurable hardware module with one or more programmable elements such as Field Programmable Gate Arrays for realizing an arbitrary hardware architecture and a reconfigurable front end with programmable transceivers for interfacing with any desired physical medium—and optionally, an embedded processor. A user specifies system features with a software configuration utility which directs a component selector to select a set of software modules and hardware configuration files from a series of libraries. The modules are embedded in a host software driver or downloaded for execution on the embedded CPU. The configuration files are downloaded to the reconfigurable hardware. The entire selection process is performed in real-time and can be changed whenever the user deems necessary. Alternatively, the user may create a graphical program in a graphical programming environment and compile the program into various software modules and configuration files for host execution, embedded processor execution, or programming the reconfigurable hardware.

212 citations


Proceedings ArticleDOI
01 Feb 2001
TL;DR: A prototype platform has been developed that allows processing of packets at the edge of a multi-gigabit-per-second network switch and simplifies the development and deployment of new hardware-accelerated packet processing circuits.
Abstract: A prototype platform has been developed that allows processing of packets at the edge of a multi-gigabit-per-second network switch. This system, the Field Programmable Port Extender (FPX), enables packet processing functions to be implemented as modular components in reprogrammable hardware. All logic on the FPX is implemented in two Field Programmable Gate Arrays (FPGAs). Packet processing functions in the system are implemented as dynamically-loadable modules. Core functionality of the FPX is implemented on an FPGA called the Networking Interface Device (NID). The NID contains the logic to transmit and receive packets over a network, dynamically reprogram hardware modules, and route individual traffic flows. A full, non-blocking switch is implemented on the NID to route packets between the networking interfaces and the modular components. Modular components of the FPX are implemented on a second FPGA called the Reprogrammable Application Device (RAD). Modules are loaded onto the RAD via reconfiguration and/or partial reconfiguration of the FPGA. Through the combination of the NID and the RAD, the FPX can individually reconfigure the packet processing functionality for one set of traffic flows, while the rest of the system continues to operate. The platform simplifies the development and deployment of new hardware-accelerated packet processing circuits. The modular nature of the system allows an active router to migrate functionality from software plugins to hardware modules.

211 citations


Patent
05 Sep 2001
TL;DR: In this article, a bit stream defining the state of switches of an FPGA is translated into a set of via geometries, or generated from a physical design system, which are then used to customize an array of fixed and/or programmable logic blocks.
Abstract: A method is comprised of translating a bit stream defining the state of switches of an FPGA into a set of via geometries, or generating the set of via geometries directly from a physical design system. The via geometries are used to produce at least one via mask. The via mask is then used in a manufacturing process to customize an array of fixed and/or programmable logic blocks.

203 citations


Patent
13 Apr 2001
TL;DR: In this paper, a voting circuit is used to determine the existence of a faulted circuit in order to eliminate the faulty circuit from the operation of the FPGA without physical addition of redundant circuits.
Abstract: In a field programmable gate array (FPGA) allowing dynamic reconfiguration in a time-multiplexed fashion, duplicate copies that are functionally identical to a primary circuit specified for a predetermined FPGA application are configured in a time-multiplexed manner. The primary and duplicate circuits are interrogated by a voting circuit which determines the existence of a faulted circuit in order to eliminate the faulted circuit from the operation of the FPGA. In this manner, fault tolerance is provided for the FPGA without the physical addition of redundant circuits, minimizing the cost, weight, volume, heat and energy issues associated with conventional redundancy techniques.

202 citations
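
The voting step described in the patent abstract above can be illustrated with a small software model. This is a minimal sketch, assuming exactly three functionally identical copies (one primary, two duplicates) and a bitwise 2-of-3 vote; the abstract does not state the number of copies or the voter logic, and the names `majority_vote` and `find_faulted` are hypothetical.

```python
def majority_vote(a: int, b: int, c: int) -> int:
    """Bitwise 2-of-3 majority of three equally wide output words."""
    return (a & b) | (a & c) | (b & c)


def find_faulted(outputs: list[int]) -> list[int]:
    """Return the indices of copies whose output disagrees with the vote.

    Assumes exactly three copies (an illustrative choice, not taken
    from the patent).
    """
    if len(outputs) != 3:
        raise ValueError("this sketch models exactly three copies")
    voted = majority_vote(*outputs)
    return [i for i, out in enumerate(outputs) if out != voted]


if __name__ == "__main__":
    # The third copy has two flipped bits; the vote masks the error.
    copies = [0b1010, 0b1010, 0b0110]
    print(bin(majority_vote(*copies)), find_faulted(copies))
```

In the time-multiplexed scheme the three outputs would be produced one after another by the same physical logic, so a model like this only captures the comparison, not the reconfiguration schedule.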


Journal ArticleDOI
TL;DR: This paper presents pipeline vectorization, a method for synthesizing hardware pipelines based on software vectorizing compilers that improves efficiency and ease of development of hardware designs, particularly for users with little electronics design experience.
Abstract: This paper presents pipeline vectorization, a method for synthesizing hardware pipelines based on software vectorizing compilers. The method improves efficiency and ease of development of hardware designs, particularly for users with little electronics design experience. We propose several loop transformations to customize pipelines to meet hardware resource constraints while maximizing available parallelism. For runtime reconfigurable systems, we apply hardware specialization to increase circuit utilization. Our approach is especially effective for highly repetitive computations in digital signal processor (DSP) and multimedia applications. Case studies using field programmable gate array (FPGA)-based platforms are presented to demonstrate the benefits of our approach and to evaluate tradeoffs between alternative implementations. For instance, the loop-tiling transformation has been found to improve vectorization performance 30-40 times above a PC-based software implementation, depending on whether runtime reconfiguration (RTR) is used.

185 citations


Book ChapterDOI
14 May 2001
TL;DR: High performance single-chip FPGA implementations of the new Advanced Encryption Standard (AES) algorithm, Rijndael, are described, with a novel, generic, parameterisable Rijndael encryptor core capable of supporting varying key sizes.
Abstract: This paper describes high performance single-chip FPGA implementations of the new Advanced Encryption Standard (AES) algorithm, Rijndael. The designs are implemented on the Virtex-E FPGA family of devices. FPGAs have proven to be very effective in implementing encryption algorithms. They provide more flexibility than ASIC implementations and produce higher data-rates than equivalent software implementations. A novel, generic, parameterisable Rijndael encryptor core capable of supporting varying key sizes is presented. The 192-bit key and 256-bit key designs run at data rates of 5.8 Gbits/sec and 5.1 Gbits/sec respectively. The 128-bit key encryptor core has a throughput of 7 Gbits/sec which is 3.5 times faster than similar existing hardware designs and 21 times faster than known software implementations, making it the fastest single-chip FPGA Rijndael encryptor core reported to date. A fully pipelined single-chip 128-bit key Rijndael encryptor/decryptor core is also presented. This design runs at a data rate of 3.2 Gbits/sec on a Xilinx Virtex-E XCV3200E-8-CG1156 FPGA device. There are no known single-chip FPGA implementations of an encryptor/decryptor Rijndael design.

172 citations


Book ChapterDOI
14 May 2001
TL;DR: This work proposes a new elliptic curve processor architecture for the computation of point multiplication for curves defined over fields GF(p) that is a scalable architecture in terms of area and speed, specially suited for memory-rich hardware platforms such as field programmable gate arrays (FPGAs).
Abstract: This work proposes a new elliptic curve processor architecture for the computation of point multiplication for curves defined over fields GF(p). This is a scalable architecture in terms of area and speed, specially suited for memory-rich hardware platforms such as field programmable gate arrays (FPGAs). This processor uses a new type of high-radix Montgomery multiplier that relies on the precomputation of frequently used values and on the use of multiple processing engines.

163 citations
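
For readers unfamiliar with the arithmetic behind the entry above, the following is a minimal sketch of textbook Montgomery multiplication modulo an odd prime p. It is not the paper's high-radix, multi-engine design: it reduces a whole product in one step, and the function names (`montgomery_setup`, `to_mont`, `mont_mul`, `from_mont`) and the choice R = 2^k with k the bit length of p are illustrative assumptions. The one precomputed constant, -p^(-1) mod R, hints at why precomputation matters in such multipliers.

```python
def montgomery_setup(p: int) -> tuple[int, int]:
    """Precompute k and -p^(-1) mod R for R = 2**k (p must be odd)."""
    if p % 2 == 0:
        raise ValueError("the modulus must be odd")
    k = p.bit_length()
    r = 1 << k
    p_inv_neg = (-pow(p, -1, r)) % r  # modular inverse, Python 3.8+
    return k, p_inv_neg


def to_mont(x: int, p: int, k: int) -> int:
    """Map x to its Montgomery form x * R mod p."""
    return (x << k) % p


def mont_mul(a_bar: int, b_bar: int, p: int, k: int, p_inv_neg: int) -> int:
    """Return a_bar * b_bar * R^(-1) mod p using only shifts and masks."""
    mask = (1 << k) - 1
    t = a_bar * b_bar
    m = ((t & mask) * p_inv_neg) & mask  # chosen so t + m*p is divisible by R
    u = (t + m * p) >> k                 # exact division by R
    return u - p if u >= p else u


def from_mont(x_bar: int, p: int, k: int, p_inv_neg: int) -> int:
    """Map a Montgomery-form value back to an ordinary residue."""
    return mont_mul(x_bar, 1, p, k, p_inv_neg)


if __name__ == "__main__":
    p = 97
    k, p_inv_neg = montgomery_setup(p)
    a_bar, b_bar = to_mont(5, p, k), to_mont(7, p, k)
    product = from_mont(mont_mul(a_bar, b_bar, p, k, p_inv_neg), p, k, p_inv_neg)
    print(product)  # equals (5 * 7) % 97
```

The reduction replaces division by p with a shift and a mask, which is the property that makes the method attractive in hardware.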


Book ChapterDOI
14 May 2001
TL;DR: This paper presents an evaluation of the Rijndael cipher from the viewpoint of its implementation in Field Programmable Devices (FPDs); the results obtained are significantly faster than those of other implementations known up to now.
Abstract: This paper presents an evaluation of the Rijndael cipher, the Advanced Encryption Standard winner, from the viewpoint of its implementation in Field Programmable Devices (FPDs). Starting with an analysis of the algorithm's general characteristics, a general cipher structure is described. Two different methods of mapping the Rijndael algorithm to FPDs are analyzed and the suitability of available FPD families is evaluated. Finally, results of the proposed mapping implemented in Altera FLEX, ACEX and APEX FPDs are presented and compared with the fastest known Xilinx FPGA implementation. The results obtained are significantly faster than those of other implementations known up to now.

159 citations


Proceedings ArticleDOI
01 Feb 2001
TL;DR: In mapping the k-means algorithm to FPGA hardware, this work examined algorithm level transforms that dramatically increased the achievable parallelism and also examined the effects of using fixed precision and truncated bit widths in the algorithm.
Abstract: In mapping the k-means algorithm to FPGA hardware, we examined algorithm level transforms that dramatically increased the achievable parallelism. We apply the k-means algorithm to multi-spectral and hyper-spectral images, which have tens to hundreds of channels per pixel of data. K-means is an iterative algorithm that assigns to each pixel a label indicating which of K clusters the pixel belongs to. K-means is a common solution to the segmentation of multi-dimensional data. The standard software implementation of k-means uses floating-point arithmetic and Euclidean distances. Floating point arithmetic and the multiplication-heavy Euclidean distance calculation are fine on a general purpose processor, but they have large area and speed penalties when implemented on an FPGA. In order to get the best performance of k-means on an FPGA, the algorithm needs to be transformed to eliminate these operations. We examined the effects of using two other distance measures, Manhattan and Max, that do not require multipliers. We also examined the effects of using fixed precision and truncated bit widths in the algorithm. It is important to explore algorithmic level transforms and tradeoffs when mapping an algorithm to reconfigurable hardware. A direct translation of the standard software implementation of k-means would result in a very inefficient use of FPGA hardware resources. Analysis of the algorithm and data is necessary for a more efficient implementation. Our resulting implementation exhibits approximately a 200 times speed up over a software implementation.
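
The two multiplier-free distance measures named in the abstract above can be sketched in a few lines. This is a software illustration only, assuming integer channel values and a plain assignment step; the function names and the sample pixel values are hypothetical and not taken from the paper.

```python
def manhattan(pixel, center):
    """Sum of absolute channel differences (no multiplications)."""
    return sum(abs(p - c) for p, c in zip(pixel, center))


def max_distance(pixel, center):
    """Largest absolute channel difference (no multiplications)."""
    return max(abs(p - c) for p, c in zip(pixel, center))


def assign_labels(pixels, centers, distance=manhattan):
    """Label each pixel with the index of its nearest cluster center."""
    return [
        min(range(len(centers)), key=lambda k: distance(px, centers[k]))
        for px in pixels
    ]


if __name__ == "__main__":
    # Hypothetical three-channel pixels and two cluster centers.
    pixels = [(10, 12, 11), (200, 190, 205), (30, 25, 40)]
    centers = [(0, 0, 0), (255, 255, 255)]
    print(assign_labels(pixels, centers))
    print(assign_labels(pixels, centers, distance=max_distance))
```

Both measures need only subtractors, comparators and adders, which is why they map to FPGA logic more cheaply than a squared Euclidean distance.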

Proceedings ArticleDOI
01 Dec 2001
TL;DR: This work introduces a way to achieve high-speed homology search by adding only one off-the-shelf PCI board with one Field Programmable Gate Array (FPGA) to a Pentium-based computer system already in use.
Abstract: We introduce a way to achieve high-speed homology search by adding only one off-the-shelf PCI board with one Field Programmable Gate Array (FPGA) to a Pentium-based computer system already in use. An FPGA is a reconfigurable device, and any kind of circuit, such as a pattern matching program, can be realized in a moment. The performance is almost proportional to the size of the FPGA used in the system, and FPGAs are becoming larger and larger following Moore's law. We can easily obtain the latest and larger FPGAs in the form of off-the-shelf PCI boards, at low cost. The results we obtained are as follows. The performance is most comparable with small to middle class dedicated hardware systems when we use a board with one of the latest FPGAs, and the performance can be further accelerated by using more FPGA boards. The time for comparing a query sequence of 2,048 elements with a database sequence of 64 million elements by the Smith-Waterman algorithm is about 34 sec, which is about 330 times faster than a desktop computer with a 1 GHz Pentium III. We can also accelerate the performance of a laptop computer using a PC card with one smaller FPGA. The time for comparing a query sequence (1,024) with the database sequence (64 million) is about 185 sec, which is about 30 times faster than the desktop computer.
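
As a point of reference for the Smith-Waterman comparison mentioned above, here is a minimal software sketch of the local alignment score. The scoring values (match +2, mismatch -1, linear gap -1) and the function name are assumptions for illustration; the abstract does not give the scoring scheme or the structure of the FPGA circuit.

```python
def smith_waterman_score(query: str, database: str,
                         match: int = 2, mismatch: int = -1,
                         gap: int = -1) -> int:
    """Best local alignment score with a linear gap penalty.

    Keeps only two rows of the dynamic programming table, so memory
    grows with the database length rather than with the full table.
    """
    prev = [0] * (len(database) + 1)
    best = 0
    for q in query:
        curr = [0]
        for j, d in enumerate(database, start=1):
            diag = prev[j - 1] + (match if q == d else mismatch)
            score = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            curr.append(score)
            best = max(best, score)
        prev = curr
    return best


if __name__ == "__main__":
    print(smith_waterman_score("ACGT", "TTACGTTT"))
```

Each cell depends only on its left, upper and upper-left neighbours, so all cells on one anti-diagonal are independent of each other; that independence is what a hardware implementation can exploit.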

Patent
03 Aug 2001
TL;DR: In this article, an interconnection network architecture based on Benes networks is described, which is especially useful for FPGAs and is rearrangeable so that routing between logic cell terminals is guaranteed.
Abstract: An interconnection network architecture which provides an interconnection network which is especially useful for FPGAs is described. Based upon Benes networks, the resulting network interconnect is rearrangeable so that routing between logic cell terminals is guaranteed. Upper limits on time delays for the network interconnect are defined and pipelining for high speed operation is easily implemented. The described network interconnect offers flexibility so that many design options are presented to best suit the desired application.

Proceedings ArticleDOI
22 Jun 2001
TL;DR: This work introduces the first hardware metering scheme that enables reliable low overhead proofs for the number of manufactured parts and establishes the connection between the requirements for hardware and synthesis process.
Abstract: We introduce the first hardware metering scheme that enables reliable low overhead proofs for the number of manufactured parts. The key idea is to make each design slightly different. Therefore, if two identical hardware designs or a design that is not reported by the foundry are detected, the design house has proof of misconduct. We start by establishing the connection between the requirements for hardware and synthesis process. Furthermore, we present mathematical analysis of statistical accuracy of the proposed hardware metering scheme. The effectiveness of the metering scheme is demonstrated on a number of designs.

Journal ArticleDOI
TL;DR: By enabling multiple applications to be dynamically loaded into a single hardware device, the DHP architecture provides a scalable mechanism for implementing high-performance programmable routers.

Book ChapterDOI
08 Apr 2001
TL;DR: A new methodology for a fair comparison of the hardware performance of secret-key block ciphers has been developed and contrasted with methodology used by the NSA team.
Abstract: The results of fast implementations of all five AES final candidates using Virtex Xilinx Field Programmable Gate Arrays are presented and analyzed. Performance of several alternative hardware architectures is discussed and compared. One architecture optimum from the point of view of the throughput to area ratio is selected for each of the two major types of block cipher modes. For feedback cipher modes, all AES candidates have been implemented using the basic iterative architecture, and achieved speeds ranging from 61 Mbit/s for Mars to 431 Mbit/s for Serpent. For non-feedback cipher modes, four AES candidates have been implemented using a high-throughput architecture with pipelining inside and outside of cipher rounds, and achieved speeds ranging from 12.2 Gbit/s for Rijndael to 16.8 Gbit/s for Serpent. A new methodology for a fair comparison of the hardware performance of secret-key block ciphers has been developed and contrasted with methodology used by the NSA team.

Proceedings Article
01 Jan 2001
TL;DR: This work introduces the first diagnosis method for multiple faulty PLBs; for any faulty PLB, it also identifies the internal faulty modules or modes of operation, and it provides the basis for both failure analysis used for yield improvement and any repair strategy used for fault-tolerance in reconfigurable systems.
Abstract: We present a built-in self-test (BIST) approach able to detect and accurately diagnose all single and practically all multiple faulty programmable logic blocks (PLBs) in field programmable gate arrays (FPGAs) with maximum diagnostic resolution. Unlike conventional BIST, FPGA BIST does not involve any area overhead or performance degradation. We also identify and solve the problem of testing configuration multiplexers that was either ignored or incorrectly solved in most previous work. We introduce the first diagnosis method for multiple faulty PLBs; for any faulty PLB, we also identify its internal faulty modules or modes of operation. Our accurate diagnosis provides the basis for both failure analysis used for yield improvement and for any repair strategy used for fault-tolerance in reconfigurable systems. We present experimental results showing detection and identification of faulty PLBs in actual defective FPGAs. Our BIST architecture is easily scalable.

Proceedings ArticleDOI
27 May 2001
TL;DR: This work proposes a hardware implementation of the Compact Genetic Algorithm using the Verilog hardware description language (HDL) and then fabricated on an FPGA, which runs about 1000 times faster than the software executing on a workstation.
Abstract: We propose a hardware implementation of the Compact Genetic Algorithm (GA). The design is realized using the Verilog hardware description language (HDL) and then fabricated on an FPGA. Our design, though simple, runs about 1000 times faster than the software executing on a workstation. An alternative hardware for linkage learning is also proposed in order to enhance the capability of the Compact GA to solve highly deceptive problems.
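
The algorithm that the design above implements in hardware can be sketched in software as follows. This is a minimal model of the textbook compact GA (a probability vector updated by pairwise competition), not the paper's Verilog design; the OneMax fitness function, the simulated population size of 20, the step cap, and all names are assumptions for illustration.

```python
import random


def compact_ga(fitness, n_bits, pop_size=20, max_steps=100_000, seed=0):
    """Return the final probability vector of a compact GA run.

    counts[i] / pop_size is the probability that bit i is 1; integer
    counters are used so every update is an exact step of 1 / pop_size.
    """
    rng = random.Random(seed)
    counts = [pop_size // 2] * n_bits
    for _ in range(max_steps):
        if all(c in (0, pop_size) for c in counts):
            break  # every bit position has converged
        a = [1 if rng.randrange(pop_size) < c else 0 for c in counts]
        b = [1 if rng.randrange(pop_size) < c else 0 for c in counts]
        winner, loser = (a, b) if fitness(a) >= fitness(b) else (b, a)
        for i in range(n_bits):
            if winner[i] != loser[i]:
                # Move the probability one step toward the winner's bit.
                counts[i] += 1 if winner[i] == 1 else -1
    return [c / pop_size for c in counts]


def one_max(bits):
    """Hypothetical test problem: count the ones."""
    return sum(bits)


if __name__ == "__main__":
    print(compact_ga(one_max, n_bits=8))
```

The whole population is represented by one counter per bit, so the state is small and each update touches every bit independently, which suggests why the algorithm suits a compact hardware implementation.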

Journal ArticleDOI
TL;DR: In this article, the authors present a built-in self-test (BIST) approach able to diagnose all single and practically all multiple faulty programmable logic blocks (PLBs) in field programmable gate arrays (FPGAs) with maximum diagnostic resolution.
Abstract: We present a built-in self-test (BIST) approach able to detect and accurately diagnose all single and practically all multiple faulty programmable logic blocks (PLBs) in field programmable gate arrays (FPGAs) with maximum diagnostic resolution. Unlike conventional BIST, FPGA BIST does not involve any area overhead or performance degradation. We also identify and solve the problem of testing configuration multiplexers that was either ignored or incorrectly solved in most previous work. We introduce the first diagnosis method for multiple faulty PLBs; for any faulty PLB, we also identify its internal faulty modules or modes of operation. Our accurate diagnosis provides the basis for both failure analysis used for yield improvement and for any repair strategy used for fault-tolerance in reconfigurable systems. We present experimental results showing detection and identification of faulty PLBs in actual defective FPGAs. Our BIST architecture is easily scalable.

Patent
19 Jan 2001
TL;DR: In this paper, a high-density mount board has external mounting pins on the bottom surface so as to be mounted on a mother board in the same manner as a system on-chip multi-chip module.
Abstract: An electronic circuit device has a high-density mount board, on which are disposed a microcomputer, a random access memory, a programmable device which is a variable logic circuit represented by FPGA, and an electrically-rewritable nonvolatile memory which can store the operation program of the microcomputer. The high-density mount board has external mounting pins on the bottom surface so as to be mounted on a mother board in the same manner as a system on-chip multi-chip module. With an intended logic function being set on the programmable device, a hardware-based function to be realized by the electronic circuit device is simulated. With an operation program being written to the nonvolatile memory, a software-based function to be realized is simulated. Consequently, the device facilitates the debugging at early stages of system development, configures a prototype system, and contributes to the time reduction throughout the system development, prototype fabrication and large-scale production.

Patent
05 Jan 2001
TL;DR: In this article, the authors propose to provide multiple configuration programs to support multiple configuration modes using the same processor hardware, which can be stored in metal-mask ROM on-chip so they can be changed without re-laying out the remainder of the FPGA.
Abstract: An FPGA has an on-chip processor that reads configuration data onto the FPGA and controls the loading of that configuration data into FPGA configuration memory cells. After FPGA power-up, the processor reads a configuration mode code from predetermined terminals of the FPGA. If the configuration mode code has a first value, then the processor executes a first configuration program so that configuration data is received onto the FPGA in accordance with a first configuration mode. If the configuration mode code has a second value, then the processor executes a second configuration program so that configuration data is received onto the FPGA in accordance with a second configuration mode. The configuration programs can be stored in metal-mask ROM on-chip so they can be changed without re-laying out the remainder of the FPGA. Providing multiple configuration programs allows the FPGA to support multiple configuration modes using the same processor hardware. One configuration mode code causes the processor to execute a loader program that in turn loads a configuration program onto the FPGA from a source external to the FPGA. Once the configuration program is loaded, the processor executes the configuration program thereby allowing the FPGA to support a custom configuration mode.

Proceedings ArticleDOI
29 Apr 2001
TL;DR: This work has extensively researched the current compression techniques, including the Huffman coding, the Arithmetic coding and LZ coding, and developed different algorithms targeting different hardware structures and demonstrates that a factor of 4 compression ratio can be achieved.
Abstract: Although run-time reconfigurable systems have been shown to achieve very high performance, the speedups over traditional microprocessor systems are limited by the cost of configuration of the hardware. Current reconfigurable systems suffer from a significant overhead due to the time it takes to reconfigure their hardware. In order to deal with this overhead, and increase the compute power of reconfigurable systems, it is important to develop hardware and software systems to reduce or eliminate this delay. In this paper, we explore the idea of configuration compression and develop algorithms for reconfigurable systems. These algorithms, targeted to Xilinx Virtex series FPGAs with minimum modification of hardware, can significantly reduce the amount of data that must be transferred during configuration. In this work we have extensively researched the current compression techniques, including Huffman coding, arithmetic coding and LZ coding. We have also developed different algorithms targeting different hardware structures. Our readback algorithm allows certain frames to be reused as a dictionary and sufficiently utilizes the regularities within the configuration bitstream. In addition, we have developed frame reordering techniques that make better use of the regularities by shuffling the sequence of the configuration. We have also developed the wildcard approach that can be used for true partial reconfiguration. The simulation results demonstrate that a factor of 4 compression ratio can be achieved.

DOI
01 Jan 2001
TL;DR: A tool called PARBIT has been developed that transforms FPGA configuration bitfiles to enable DHP modules; with it, it is possible to define a partial reconfigurable area inside the FPGA and download it into a specified region of the FPGA device.
Abstract: Field Programmable Gate Arrays (FPGAs) can be partially reconfigured to implement Dynamically loadable Hardware Plugin (DHP) modules. A tool called PARBIT has been developed that transforms FPGA configuration bitfiles to enable DHP modules. With this tool it is possible to define a partial reconfigurable area inside the FPGA and download it into a specified region of the FPGA device. One or more DHPs, with different sizes can be implemented using PARBIT.

Journal ArticleDOI
TL;DR: In this paper, an evolution-oriented field programmable transistor array (FPTA) is proposed, which allows evolutionary experiments with reconfiguration at various levels of granularity and can be used to automatically synthesize a variety of analog and digital circuits.
Abstract: Evolvable hardware (EHW) addresses on-chip adaptation and self-configuration through evolutionary algorithms. Current programmable devices, in particular the analog ones, lack evolution-oriented characteristics. This paper proposes an evolution-oriented field programmable transistor array (FPTA), reconfigurable at transistor level. The FPTA allows evolutionary experiments with reconfiguration at various levels of granularity. Experiments in SPICE simulations and directly on a reconfigurable FPTA chip demonstrate how the evolutionary approach can be used to automatically synthesize a variety of analog and digital circuits.

Journal ArticleDOI
TL;DR: This work presents the first technique that leverages the unique characteristics of field-programmable gate arrays (FPGAs) to protect commercial investment in intellectual property through fingerprinting.
Abstract: As current computer-aided design (CAD) tool and very large scale integration technology capabilities create a new market of reusable digital designs, the economic viability of this new core-based design paradigm is pending on the development of techniques for intellectual property protection. This work presents the first technique that leverages the unique characteristics of field-programmable gate arrays (FPGAs) to protect commercial investment in intellectual property through fingerprinting. A hidden encrypted mark is embedded into the physical layout of a digital circuit when it is placed and routed onto the FPGA. This mark uniquely identifies both the circuit origin and original circuit recipient, yet is difficult to detect and/or remove, even via recipient collusion. While this approach imposes additional constraints on the backend CAD tools for circuit place and route, experiments indicate that the performance and area impacts are minimal.

Proceedings ArticleDOI
13 Mar 2001
TL;DR: A compiler that takes high level signal and image processing algorithms described in MATLAB and generates an optimized hardware for an FPGA with external memory and a combined precision and error analysis algorithm to infer the minimum number of bits required by a floating point variable is presented.
Abstract: We present a compiler that takes high level signal and image processing algorithms described in MATLAB and generates an optimized hardware for an FPGA with external memory. We propose a precision analysis algorithm to determine the minimum number of bits required by an integer variable and a combined precision and error analysis algorithm to infer the minimum number of bits required by a floating point variable. Our results show that on average, our algorithms generate hardware requiring a factor of 5 less FPGA resources in terms of the configurable logic blocks (CLBs) consumed as compared to the hardware generated without these optimizations. We show that our analysis results in the reduction in the size of lookup tables for functions like sin, cos, sqrt, exp etc. Our precision analysis also enables us to pack various array elements into a single memory location to reduce the number external memory accesses. We show that such a technique improves the performance of the generated hardware by an average of 35%.
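
The idea of sizing an integer variable from its value range, and of packing several narrow array elements into one memory word, can be illustrated with a short calculation. This is a minimal sketch assuming the value range of the variable is already known and that signed values use two's complement; it is not the paper's analysis algorithm, and the function names and the 32-bit word width are hypothetical.

```python
def min_bits(lo: int, hi: int) -> int:
    """Smallest bit width that can represent every integer in [lo, hi].

    Unsigned encoding is assumed when lo >= 0, two's complement otherwise.
    """
    if lo > hi:
        raise ValueError("empty range")
    if lo >= 0:
        return max(1, hi.bit_length())
    magnitude_bits = max((-lo - 1).bit_length(), max(hi, 0).bit_length())
    return magnitude_bits + 1  # one extra bit for the sign


def elements_per_word(lo: int, hi: int, word_width: int = 32) -> int:
    """How many elements of range [lo, hi] fit in one memory word."""
    return word_width // min_bits(lo, hi)


if __name__ == "__main__":
    print(min_bits(0, 255), min_bits(-128, 127), min_bits(0, 1000))
    print(elements_per_word(0, 255))
```

With an assumed 32-bit external memory word, four 8-bit pixels fit in one location, so a loop over such an array would need a quarter of the memory accesses.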

Proceedings ArticleDOI
26 Sep 2001
TL;DR: An FPGA Rijndael encryption design is presented that uses look-up tables to implement the entire Rijndael Round function; it achieves a speed of 12 Gbits/sec, 1.2 times faster than an alternative design.
Abstract: An FPGA Rijndael encryption design is presented, which utilizes look-up tables to implement the entire Rijndael Round function. A comparison is provided between this design and similar existing implementations. Hardware implementations of encryption algorithms prove much faster than equivalent software implementations, and since there is a need to perform encryption on data in real time, speed is very important. In particular, field programmable gate arrays (FPGAs) are well suited to encryption implementations due to their flexibility and an architecture that can be exploited to accommodate typical encryption transformations. A look-up table based Rijndael design achieves a speed of 12 Gbits/sec, which is 1.2 times faster than an alternative design in which look-up tables are utilized to implement only one of the Round function transformations, and 6 times faster than other previous implementations.
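The abstract does not describe how its look-up tables are organised. One widely used way to fold a Rijndael round into table look-ups is the "T-table" construction, sketched below in software as an illustration only; it is not claimed to be the paper's hardware design. The helper names (`gf_mul`, `build_sbox`, `build_t0`, `round_column`) are hypothetical, and the round key addition is left out.

```python
def gf_mul(a: int, b: int) -> int:
    """Multiply in GF(2^8) modulo x^8 + x^4 + x^3 + x + 1 (0x11B)."""
    result = 0
    for _ in range(8):
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return result


def rotl8(x: int, n: int) -> int:
    return ((x << n) | (x >> (8 - n))) & 0xFF


def build_sbox() -> list:
    """SubBytes table: multiplicative inverse followed by the affine transform."""
    sbox = []
    for x in range(256):
        inv = 0
        if x:
            inv = 1
            for _ in range(254):  # x**254 is the inverse of x in GF(2^8)
                inv = gf_mul(inv, x)
        s = inv ^ rotl8(inv, 1) ^ rotl8(inv, 2) ^ rotl8(inv, 3) ^ rotl8(inv, 4) ^ 0x63
        sbox.append(s)
    return sbox


def build_t0(sbox: list) -> list:
    """T0[x] packs (2*S[x], S[x], S[x], 3*S[x]): SubBytes and MixColumns in one look-up."""
    return [
        (gf_mul(s, 2) << 24) | (s << 16) | (s << 8) | gf_mul(s, 3)
        for s in sbox
    ]


def rotr32(x: int, n: int) -> int:
    return ((x >> n) | (x << (32 - n))) & 0xFFFFFFFF


SBOX = build_sbox()
T0 = build_t0(SBOX)


def round_column(col: list) -> int:
    """SubBytes + MixColumns for one (already row-shifted) column via four look-ups."""
    a0, a1, a2, a3 = col
    return T0[a0] ^ rotr32(T0[a1], 8) ^ rotr32(T0[a2], 16) ^ rotr32(T0[a3], 24)
```

In this construction each output column costs four table reads and three XORs, which is why table-based designs map naturally onto FPGA memory resources.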

Patent
08 Feb 2001
TL;DR: A field programmable gate array has security configuration features that prevent monitoring of its configuration data, which is stored in encrypted form in an external nonvolatile memory.
Abstract: A field programmable gate array has security configuration features to prevent monitoring of the configuration data for the field programmable gate array. The configuration data is encrypted by a security circuit of the field programmable gate array using a security key. This encrypted configuration data is stored in an external nonvolatile memory. To configure the field programmable gate array, the encrypted configuration data is decrypted by the security circuit of the field programmable gate array using the security key stored in the artwork of the field programmable gate array. The secret key consists of a number of bits of key information that are embedded within the photomasks used to manufacture the FPGA chip.

Journal ArticleDOI
TL;DR: Constructing physically homogeneous, undifferentiated hardware that is differentiated into various digital circuits after manufacture meets both the immediate goal of realizing specific CPU and memory architectures using atomic-scale switches and the larger goal of being able to construct any digital circuit using the same fixed manufacturing process.
Abstract: Much effort has been put into the development of atomic-scale switches and the construction of computers from atomic-scale components. We propose the construction of physically homogeneous, undifferentiated hardware that is later, after manufacture, differentiated into various digital circuits. This achieves both the immediate goal of achieving specific CPU and memory architectures using atomic-scale switches as well as the larger goal of being able to construct any digital circuit, using the same fixed manufacturing process. Moreover, this opens the way to implementing fundamentally new types of circuit, including dynamic, massively parallel, self-modifying ones. Additionally, the specific architecture in question is not particularly complex, making it easier to construct than most other architectures. We have developed a computing architecture, the Cell Matrix™, that fits this more attainable manufacturing goal, as well as a process for taking undifferentiated hardware and differentiating it efficiently and cheaply into desirable circuitry. The Cell Matrix is based on a single atomic unit called a cell, which is repeated over and over to form a multidimensional matrix of cells. In addition to being general purpose, the architecture is highly scalable, so much so that it appears to provide access to the differentiation and use of trillion-trillion-switch hardware. This is not possible with a field programmable gate array architecture, because its gate array is configured serially, and serial configuration of trillion-trillion-switch hardware would take years. This paper describes the cell in detail and describes how networks of cells in a matrix are used to create small circuits. It also describes a sample application of the architecture that makes beneficial use of high switch counts.
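The serial configuration argument can be checked with back-of-the-envelope arithmetic. The sketch below assumes one configuration bit per switch and a serial rate of one bit per nanosecond; both figures are assumptions chosen for illustration, since the abstract states neither, and `serial_config_years` is a hypothetical helper name.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600  # Julian year in seconds


def serial_config_years(switches: float, bits_per_second: float) -> float:
    """Years needed to shift in one configuration bit per switch, one bit at a time."""
    return switches / bits_per_second / SECONDS_PER_YEAR


# "Trillion trillion" switches = 10**12 * 10**12 = 10**24 (assumed: 1 bit per switch).
years = serial_config_years(1e24, 1e9)
print(f"{years:.3g} years")
```

Under these assumptions the serial load takes on the order of tens of millions of years, which is consistent with the abstract's point that serial configuration does not scale to such switch counts.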

Patent
25 May 2001
TL;DR: A bus interface circuit for a programmable logic device (PLD) including an interface multiplexer connected between two or more external communication circuits and a configuration memory array is described in this article.
Abstract: A bus interface circuit for a programmable logic device (PLD) includes an interface multiplexer connected between two or more external communication circuits and a configuration memory array. The interface multiplexer coordinates communication between a selected one of the external communication circuits and a packet processor. The packet processor interprets command/data information transmitted in a bit stream from the selected external communication circuit. In a default state, the interface multiplexer connects dual-purpose input/output pins of the PLD to the packet processor. In an alternative state, the interface multiplexer connects a JTAG interface circuit to the packet processor to facilitate configuration operations through the JTAG pins of the PLD.
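The multiplexer behaviour described above can be modelled as a small state machine. This is a behavioural sketch only, assuming a single selected source at a time; the class and method names (`Source`, `InterfaceMux`, `select`, `send`) are hypothetical and do not come from the patent.

```python
from enum import Enum
from typing import Callable


class Source(Enum):
    IO_PINS = "dual-purpose I/O pins"
    JTAG = "JTAG interface circuit"


class InterfaceMux:
    """Routes the bit stream of one selected source to the packet processor."""

    def __init__(self, packet_processor: Callable[[bytes], None]) -> None:
        self._packet_processor = packet_processor
        self.selected = Source.IO_PINS  # default state per the abstract

    def select(self, source: Source) -> None:
        self.selected = source

    def send(self, source: Source, bits: bytes) -> bool:
        """Forward data only if it comes from the currently selected source."""
        if source is not self.selected:
            return False
        self._packet_processor(bits)
        return True
```

In this model, data arriving on the unselected path is simply dropped, which mirrors the idea that only one external communication circuit talks to the packet processor at a time.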