Open Access Journal Article

Run-time generation of partial FPGA configurations

Miguel L. Silva, +1 more
01 Jan 2012 - Vol. 58, Iss. 1, pp. 24-37
TLDR
The method is intended for use in adaptive embedded systems that employ run-time reconfiguration to achieve high flexibility and performance. It is embodied in a code library that applications can use to create new bitstreams at run-time.
About
This article was published in the Journal of Systems Architecture on 2012-01-01 and is currently open access. It has received 3 citations to date. The article focuses on the topics: Netlist & Bitstream.


Citations
Journal Article

Fast and standalone Design Space Exploration for High-Level Synthesis under resource constraints

TL;DR: A new methodology for Design Space Exploration (DSE) in the context of High-Level Synthesis (HLS) for HPC and embedded systems targeting FPGAs, which shows a high generation time speed-up compared to one other existing HLS approach, while preserving correct performance of the generated circuits.
Proceedings Article

A Fast and Autonomous HLS Methodology for Hardware Accelerator Generation under Resource Constraints

TL;DR: A new methodology for hardware accelerator generation, in the context of High Level Synthesis (HLS) for Field Programmable Gate Array (FPGA) components, which shows a high generation speed-up compared to other existing HLS approaches, while preserving correct performance of the generated circuits.

Generation of Custom Run-Time Reconfigurable Hardware for Transparent Binary Acceleration

Nuno Paulino
TL;DR: This work designed and evaluated a transparent binary acceleration approach, targeting Field Programmable Gate Array devices, which relies on instruction traces to automatically generate specialized accelerator instances, and which is capable of expediently generating accelerator-augmented embedded systems which achieve considerable performance increases whilst incurring a low resource cost, and without requiring manual hardware design.
References
Journal Article

Optimization by Simulated Annealing

TL;DR: There is a deep and useful connection between statistical mechanics and multivariate or combinatorial optimization (finding the minimum of a given function depending on many parameters), and a detailed analogy with annealing in solids provides a framework for optimization of very large and complex systems.
Proceedings Article

MediaBench: a tool for evaluating and synthesizing multimedia and communications systems

TL;DR: MediaBench is a benchmark suite designed to fill the gap between the compiler community and embedded applications developers. It was constructed through a three-step process: intuition and market driven initial selection, experimental measurement, and integration with system synthesis algorithms to establish usefulness.
Book

Architecture and CAD for Deep-Submicron FPGAs

TL;DR: From the Publisher: Architecture and CAD for Deep-Submicron FPGAs addresses several key issues in the design of high-performance FPGA architectures and CAD tools, with particular emphasis on issues that are important for FPGAs implemented in deep-submicron processes.
Proceedings Article

Invited Paper: Enhanced Architectures, Design Methodologies and CAD Tools for Dynamic Reconfiguration of Xilinx FPGAs

TL;DR: In this article, the authors describe architectural enhancements to Xilinx FPGAs that provide better support for the creation of dynamically reconfigurable designs, augmented by a new design methodology that uses pre-routed IP cores for communication between static and dynamic modules and permits static designs to route through regions otherwise reserved for dynamic modules.
Proceedings Article

Dynamic hardware plugins in an FPGA with partial run-time reconfiguration

TL;DR: Tools and a design methodology have been developed to support partial run-time reconfiguration of FPGA logic on the Field Programmable Port Extender to support high-speed Internet packet processing circuits on this platform.
Frequently Asked Questions (16)
Q1. What are the contributions in "Run-time generation of partial FPGA configurations"?

This paper presents and evaluates a method of generating partial bitstreams at run-time for dynamic reconfiguration of sections of an FPGA. 

The standard synthesis flow produces a full bitstream for each component, which must be post-processed in order to extract the relevant partial bitstream and to add more information for use at run-time. 

A router for just-in-time mapping of a device-independent configuration description to a specific device architecture is described by [21]: it is able to produce good hardware circuits using 13 times less memory and executing 10 times faster than VPR [22] (running on a desktop computer). 

The creation of the new bitstreams requires assigning positions of the reconfigurable area to components (placement), relocating and merging the individual component bitstreams, and interconnecting the components (routing) by modification of the merged bitstream. 

The configuration memory controller is responsible for handling the actual transfer of the bitstreams to the configuration memory through the internal configuration access port (ICAP). 

Since routes can cross stripes by going through the components or by using empty spaces, a stripe density value is calculated, which is given by the percentage of occupied CLBs. 
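The stripe density described above can be sketched as follows. This is only an illustration: the grid representation, the choice to model a stripe as a set of columns, and the name `stripe_density` are assumptions, not details taken from the paper.

```python
def stripe_density(occupied, stripe_cols):
    """Return the percentage of occupied CLBs within one stripe.

    occupied    -- 2D list of booleans; occupied[row][col] is True when the
                   CLB at that position is used by a component (assumed model)
    stripe_cols -- column indices that make up the stripe (assumed model)
    """
    cols = list(stripe_cols)
    total = len(occupied) * len(cols)
    if total == 0:
        return 0.0
    used = sum(1 for row in occupied for c in cols if row[c])
    return 100.0 * used / total


# Hypothetical 2x4 dynamic area; True marks an occupied CLB.
grid = [
    [True, False, True, False],
    [True, True, False, False],
]
left = stripe_density(grid, [0, 1])
right = stripe_density(grid, [2, 3])
```

A low value suggests the stripe has more empty space available for routes to cross it.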

At the beginning of the first step, the bitstream of the empty dynamic area is used to initialize the working array of configuration information. 

With the current system, the physical interface adaptation might be performed at run-time by adding appropriate "glue" components to the design. 

The results for a set of 29 benchmarks (both synthetic and application-derived) show that the time required for bitstream generation on a 300 MHz PowerPC embedded processor depends strongly on the complexity of the circuits, but is under 35 s for all benchmarks (average: 18 s). 

In general, the route reuse heuristic is very cost effective, because it provides an improvement in running time at a very low implementation cost. 

Procedure LinkPins (Alg. 3) is used to find a path from a source pin to one sink pin: it performs a breadth-first search for the shortest path between the source and sink terminals (line 4). 
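A breadth-first shortest-path search of the kind LinkPins is said to perform can be sketched as below. This is not the paper's Alg. 3: the adjacency-dictionary routing graph and the name `bfs_path` are assumptions made for illustration.

```python
from collections import deque


def bfs_path(graph, source, sink):
    """Return a shortest path (fewest hops) from source to sink, or None.

    graph -- dict mapping a node to an iterable of neighbouring nodes
             (a hypothetical stand-in for the routing resource graph)
    """
    parent = {source: None}          # also serves as the visited set
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == sink:
            path = []
            while node is not None:  # walk back to the source
                path.append(node)
                node = parent[node]
            return path[::-1]
        for nxt in graph.get(node, ()):
            if nxt not in parent:
                parent[nxt] = node
                queue.append(nxt)
    return None


# Hypothetical routing graph: pins and wire segments as string labels.
routing = {"src": ["w1", "w2"], "w1": ["w3"], "w2": ["w3"], "w3": ["sink"]}
route = bfs_path(routing, "src", "sink")
```

Because breadth-first search explores nodes in order of hop count, the first time the sink is dequeued the recovered path uses the minimum number of segments.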

In this work, the creation of a new partial bitstream involves three major tasks: placement, routing, and bitstream construction (Fig. 2). 

The algorithm stops after a predefined number of temperature decreases (currently, 50), or if it fails to make any improvement for a fixed number of successive temperature decreases (currently, 10). 
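The two stopping conditions can be sketched in a generic simulated annealing loop. Only the limits of 50 temperature decreases and 10 decreases without improvement come from the text; the initial temperature, cooling factor, moves per temperature, and the `cost` and `neighbor` callbacks are assumptions for illustration.

```python
import math
import random


def anneal(initial, cost, neighbor, t0=10.0, alpha=0.9, moves_per_temp=100,
           max_temp_steps=50, patience=10, seed=0):
    """Generic annealing loop; returns (best, best_cost, temperature_steps)."""
    rng = random.Random(seed)
    current, current_cost = initial, cost(initial)
    best, best_cost = current, current_cost
    temp, steps, stale = t0, 0, 0
    # Stop after max_temp_steps temperature decreases, or after `patience`
    # successive decreases that brought no improvement of the best cost.
    while steps < max_temp_steps and stale < patience:
        improved = False
        for _ in range(moves_per_temp):
            cand = neighbor(current, rng)
            cand_cost = cost(cand)
            delta = cand_cost - current_cost
            # Always accept improvements; accept worse moves with
            # probability exp(-delta / temp).
            if delta <= 0 or rng.random() < math.exp(-delta / temp):
                current, current_cost = cand, cand_cost
                if current_cost < best_cost:
                    best, best_cost = current, current_cost
                    improved = True
        stale = 0 if improved else stale + 1
        temp *= alpha
        steps += 1
    return best, best_cost, steps


def step(x, rng):
    """Toy move: shift an integer position by one in either direction."""
    return x + rng.choice((-1, 1))


# A flat cost never improves, so the patience limit ends the run.
_, _, flat_steps = anneal(0, lambda x: 0, step)
# A toy quadratic cost; the run is bounded by the 50-step limit.
best, best_cost, used_steps = anneal(20, lambda x: (x - 3) ** 2, step)
```

With a flat cost function no move ever improves the best solution, so the run ends after exactly `patience` temperature decreases.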

The working implementation described here shows that run-time generation of configurations is a feasible technique for highly adaptive embedded systems, where it may be used to provide precisely tailored hardware support for tasks whose computational needs exceed the computational power of the CPU. 

The data of Table 3 show that the most significant improvement in routing time occurs for benchmark H8: with SA2D the routing time decreases 19.3%, for a global improvement of 15.9%. 

The average length of the connections is also systematically improved, in the best cases by more than one segment: M2 shows a reduction of 18.7% from 5.67 to 4.61 for the average length.
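As a check on the figures quoted for M2, the relative reduction follows directly from the two average lengths:

```latex
\[
\frac{5.67 - 4.61}{5.67} = \frac{1.06}{5.67} \approx 0.187 = 18.7\%
\]
```

The absolute difference of 1.06 is consistent with the statement that the best cases improve by more than one segment.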