Showing papers in "International Journal of Reconfigurable Computing" in 2008

Open Access

Journal Article•DOI•

Architecture-Level Exploration of Alternative Interconnection Schemes Targeting 3D FPGAs: A Software-Supported Methodology

Kostas Siozios¹, Alexandros Bartzas, Dimitrios Soudris•Institutions (1)

Democritus University of Thrace¹

01 Jan 2008-International Journal of Reconfigurable Computing

TL;DR: This paper proposes a software-supported methodology for exploring and evaluating alternative interconnection schemes for 3D FPGAs, and achieves an 8% higher utilization ratio for the vertical interconnections than existing approaches, leading to cheaper and more reliable devices.

Abstract: In current reconfigurable architectures, the interconnection structures contribute increasingly to delay and power consumption. The demand for higher clock frequencies and logic density (smaller area footprint) makes the problem even more important. Three-dimensional (3D) architectures can alleviate this problem by accommodating a number of functional layers, each of which might be fabricated in a different technology. However, the benefits of such integration technology have not yet been sufficiently explored. In this paper, we propose a software-supported methodology for exploring and evaluating alternative interconnection schemes for 3D FPGAs. To support the proposed methodology, three new CAD tools were developed (part of the 3D MEANDER Design Framework). During our exploration, we study the impact of vertical interconnection between functional layers on a number of design parameters. More specifically, the average gains in operation frequency, power consumption, and wirelength are 35%, 32%, and 13%, respectively, compared to existing 2D FPGAs with identical logic resources. We also achieve an 8% higher utilization ratio for the vertical interconnections than existing approaches for designing 3D FPGAs, leading to cheaper and more reliable devices.

45 citations

Journal Article•DOI•

FPGA-Based Embedded Motion Estimation Sensor

Zhaoyi Wei, Dah-Jye Lee, Brent Nelson, James Archibald, Barrett B. Edwards

15 Jul 2008-International Journal of Reconfigurable Computing

TL;DR: A more accurate optical flow algorithm is proposed that processes 15 frames per second with much improved accuracy; a higher frame rate can be achieved with further optimization and additional memory space.

Abstract: Accurate real-time motion estimation is critical to many computer vision tasks. However, because of its computational power and processing speed requirements, it is rarely used for real-time applications, especially for micro unmanned vehicles (UVs). In our previous work, an FPGA system was built to process optical flow vectors at 64 frames per second. Compared to software-based algorithms, this system achieved a much higher frame rate but only marginal accuracy. In this paper, a more accurate optical flow algorithm is proposed. Temporal smoothing is incorporated in the hardware structure, which significantly improves the algorithm's accuracy. To accommodate temporal smoothing, the hardware structure is composed of two parts: the derivative (DER) module produces intermediate results, and the optical flow computation (OFC) module calculates the final optical flow vectors. Software running on a built-in processor on the FPGA chip is used in the design to direct the data flow and manage hardware components. This new design has been implemented on a compact, low-power, high-performance hardware platform for micro UV applications. It is able to process 15 frames per second with much improved accuracy. A higher frame rate can be achieved with further optimization and additional memory space.

23 citations

Journal Article•DOI•

On the use of magnetic RAMs in field-programmable gate arrays

Yoann Guillemenet¹, Lionel Torres¹, Gilles Sassatelli¹, N. Bruchon²•Institutions (2)

University of Montpellier¹, Areva²

01 Jan 2008-International Journal of Reconfigurable Computing

TL;DR: This paper describes the integration of field-induced magnetic switching (FIMS) and thermally assisted switching (TAS) magnetic random access memories in FPGA design, and suggests that a real-time reconfigurable micro-FPGA using FIMS-MRAM or TAS-MRAM allows dynamic reconfiguration mechanisms while featuring a simple design architecture.

Abstract: This paper describes the integration of field-induced magnetic switching (FIMS) and thermally assisted switching (TAS) magnetic random access memories in FPGA design. The nonvolatility of the latter is achieved through the use of magnetic tunneling junctions (MTJs) in the MRAM cell. A thermally assisted switching scheme helps to reduce power consumption during write operations in comparison to the writing scheme in the FIMS-MTJ device. Moreover, the nonvolatility of such a design, based on either an FIMS or a TAS writing scheme, should reduce both the power consumption and the configuration time required at each power-up of the circuit in comparison to classical SRAM-based FPGAs. A real-time reconfigurable (RTR) micro-FPGA using FIMS-MRAM or TAS-MRAM allows dynamic reconfiguration mechanisms while featuring a simple design architecture.

23 citations

Journal Article•DOI•

Burst-Mode Asynchronous Controllers on FPGA

Duarte L. Oliveira¹, Marius Strum, Sandro Shoiti Sato•Institutions (1)

Instituto Tecnológico de Aeronáutica¹

01 Jan 2008-International Journal of Reconfigurable Computing

TL;DR: This work proposes a method that implements a popular class of asynchronous circuits, known as burst mode, on FPGAs based on look-up table architectures, and presents two conditions that guarantee essential hazard-free implementation on any LUT-based FPGA.

Abstract: FPGAs have mainly been used to design synchronous circuits. Asynchronous design on FPGAs is difficult because the resulting circuit may suffer from hazard problems. We propose a method that implements a popular class of asynchronous circuits, known as burst mode, on FPGAs based on look-up table architectures. We present two conditions that, if satisfied, guarantee essential hazard-free implementation on any LUT-based FPGA. As a result, besides retaining the intrinsic advantages of asynchronous over synchronous circuits, such designs also benefit from the shorter design time and lower cost associated with FPGA designs.

16 citations

Journal Article•DOI•

A Game-Theoretic Approach for Run-Time Distributed Optimization on MP-SoC

Diego Puschini¹, Fabien Clermidy¹, Pascal Benoit, Gilles Sassatelli, Lionel Torres•Institutions (1)

Commissariat à l'énergie atomique et aux énergies alternatives¹

29 Sep 2008-International Journal of Reconfigurable Computing

TL;DR: This paper introduces a scalable multiobjective approach based on game theory that adjusts the frequency of each PE at run-time, aiming to reduce the tile temperature while maintaining the synchronization between application tasks.

Abstract: With hundreds of processing elements (PEs) forecast, future embedded systems will be able to handle multiple applications with very diverse running constraints. Systems will integrate distributed decision capabilities. In order to control power and temperature, dynamic voltage and frequency scaling (DVFS) is applied at the PE level. At the system level, this implies dynamically managing the voltage/frequency pairs of each tile to obtain a global optimization. This paper introduces a scalable multiobjective approach based on game theory, which adjusts the frequency of each PE at run-time. It aims at reducing the tile temperature while maintaining the synchronization between application tasks. Results show that the proposed run-time algorithm requires an average of 20 calculation cycles to find the solution for a 100-processor platform and reaches performance equivalent to that of an offline method. Temperature reductions of about 23% were achieved on a demonstrative test case.

13 citations

Journal Article•DOI•

The Coarse-Grained/Fine-Grained Logic Interface in FPGAs with Embedded Floating-Point Arithmetic Units

Chi Wai Yu, Julien Lamoureux, Steven J. E. Wilton, Philip H. W. Leong, Wayne Luk

01 Jan 2008-International Journal of Reconfigurable Computing

TL;DR: An empirical study covering the location, pin arrangement, and interconnect between embedded floating point units (FPUs) and the fine-grained logic fabric in FPGAs shows that FPUs should be square, that they should be positioned near the center of the FPGA, and that their pins should be arranged on all four sides of the FPU.

Abstract: This paper examines the interface between fine-grained and coarse-grained programmable logic in FPGAs. Specifically, it presents an empirical study that covers the location, pin arrangement, and interconnect between embedded floating point units (FPUs) and the fine-grained logic fabric in FPGAs. It also studies this interface in FPGAs which contain both FPUs and embedded memories. The results show that (1) FPUs should have a square aspect ratio; (2) they should be positioned near the center of the FPGA; (3) their I/O pins should be arranged around all four sides of the FPU; (4) embedded memory should be located between the FPUs; and (5) connecting higher I/O density coarse-grained blocks increases the demand for routing resources. The hybrid FPGAs with embedded memory required 12% wider channels than the case where embedded memory is not used.

11 citations

Journal Article•DOI•

An Embedded Reconfigurable IP Core with Variable Grain Logic Cell Architecture

Motoki Amagasaki¹, Ryoichi Yamaguchi, Masahiro Koga, Masahiro Iida, Toshinori Sueyoshi•Institutions (1)

Kumamoto University¹

20 Oct 2008-International Journal of Reconfigurable Computing

TL;DR: This paper proposes a variable grain logic cell (VGLC) architecture, which consists of a 4-bit ripple carry adder with configuration memory bits, and develops a technology mapping tool; the evaluation shows a 31% improvement in logic depth and a 55% average reduction in the amount of configuration data compared to the Virtex-4 logic cell architecture.

Abstract: Reconfigurable logic devices (RLDs) are classified as the fine-grained or coarse-grained type based on their basic logic cell architecture. In general, each architecture has its own advantage. Therefore, it is difficult to achieve a balance between the operation speed and implementation area in various applications. In the present paper, we propose a variable grain logic cell (VGLC) architecture, which consists of a 4-bit ripple carry adder with configuration memory bits, and develop a technology mapping tool. The key feature of the VGLC architecture is that the variable granularity is a tradeoff between the coarse-grained and fine-grained types required for the implementation of arithmetic and random logic, respectively. Finally, we evaluate the proposed logic cell using the newly developed technology mapping tool, which improves logic depth by 31% and reduces the amount of configuration data by 55% on average, as compared to the Virtex-4 logic cell architecture.
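
As a point of reference for the arithmetic side of the VGLC cell, the following is a minimal Python sketch of a generic, textbook 4-bit ripple-carry adder. It does not model the configuration memory bits or the cell's variable-grain modes, and the function names are illustrative assumptions, not taken from the paper.

```python
def full_adder(a: int, b: int, cin: int) -> tuple[int, int]:
    """One-bit full adder: returns (sum_bit, carry_out)."""
    s = a ^ b ^ cin
    cout = (a & b) | (cin & (a ^ b))
    return s, cout


def ripple_carry_add4(x: int, y: int, cin: int = 0) -> tuple[int, int]:
    """Add two 4-bit values by chaining four full adders.

    The carry out of each bit position feeds the next one, which is
    what makes the adder "ripple". Returns (4-bit sum, carry_out).
    """
    carry = cin
    result = 0
    for i in range(4):
        s, carry = full_adder((x >> i) & 1, (y >> i) & 1, carry)
        result |= s << i
    return result, carry


if __name__ == "__main__":
    # 9 + 8 = 17, which overflows 4 bits: the sum wraps and carry_out is set.
    print(ripple_carry_add4(9, 8))
```

Because the carry must propagate through every bit position in turn, the delay of such an adder grows with its width.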

10 citations

Journal Article•DOI•

Dynamic Hardware Development

Stephen Craven¹, Peter Athanas•Institutions (1)

University of Tennessee at Chattanooga¹

15 Oct 2008-International Journal of Reconfigurable Computing

TL;DR: This paper discusses the creation of a high-level development environment for reconfigurable designs that leverages an existing high-level synthesis tool to enable the design, simulation, and implementation of dynamically reconfigurable hardware solely from a specification written in C.

Abstract: Applications that leverage the dynamic partial reconfigurability of modern FPGAs are few, owing in large part to the lack of suitable tools and techniques to create them. While the trend in digital design is towards higher levels of design abstraction, forgoing hardware description languages in some cases for high-level languages, the development of a reconfigurable design requires developers to work at a low level and contend with many poorly documented architecture-specific aspects. This paper discusses the creation of a high-level development environment for reconfigurable designs that leverages an existing high-level synthesis tool to enable the design, simulation, and implementation of dynamically reconfigurable hardware solely from a specification written in C. Unlike previous attempts, this approach encompasses the entirety of design and implementation, enables self-reconfiguration through an embedded controller, and inherently handles partial reconfiguration. Benchmarking numbers are provided, which validate the productivity enhancements this approach provides.

7 citations

Journal Article•DOI•

Multiobjective Optimization for Reconfigurable Implementation of Medical Image Registration

Omkar Dandekar, William Plishker, Shuvra S. Bhattacharyya, Raj Shekhar¹•Institutions (1)

University of Maryland Medical Center¹

01 Jan 2008-International Journal of Reconfigurable Computing

TL;DR: A novel multiobjective wordlength optimization strategy developed through FPGA-based implementation of a representative computationally intensive image processing application: medical image registration is presented and may be adapted to a wide range of signal processing applications.

Abstract: In real-time signal processing, a single application often has multiple computationally intensive kernels that can benefit from acceleration using custom or reconfigurable hardware platforms, such as field-programmable gate arrays (FPGAs). For adaptive utilization of resources at run time, FPGAs with capabilities for dynamic reconfiguration are emerging. In this context, it is useful for designers to derive sets of efficient configurations that trade off application performance with fabric resources. Such sets can be maintained at run time so that the best available design tradeoff is used. Finding a single, optimized configuration is difficult, and generating a family of optimized configurations suitable for different run-time scenarios is even more challenging. We present a novel multiobjective wordlength optimization strategy developed through FPGA-based implementation of a representative computationally intensive image processing application: medical image registration. Tradeoffs between FPGA resources and implementation accuracy are explored, and Pareto-optimized wordlength configurations are systematically identified. We also compare search methods for finding Pareto-optimized design configurations and demonstrate the applicability of search based on evolutionary techniques for identifying superior multiobjective tradeoff curves. We demonstrate feasibility of this approach in the context of FPGA-based medical image registration; however, it may be adapted to a wide range of signal processing applications.
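
To illustrate what identifying Pareto-optimized configurations means in this setting, the sketch below filters a set of candidate wordlength configurations down to the non-dominated ones. The labels, resource costs, and error values are invented for illustration, and the brute-force filter is only a stand-in for the search methods the paper compares; it is not the authors' algorithm.

```python
from typing import List, Tuple

# (label, resource_cost, error): all values here are hypothetical.
Config = Tuple[str, float, float]


def pareto_front(configs: List[Config]) -> List[Config]:
    """Return the configurations not dominated by any other one.

    A configuration is dominated if another one is no worse in both
    objectives (resource cost and error) and strictly better in at
    least one. Both objectives are minimized.
    """
    front = []
    for name, cost, err in configs:
        dominated = any(
            (c2 <= cost and e2 <= err) and (c2 < cost or e2 < err)
            for _, c2, e2 in configs
        )
        if not dominated:
            front.append((name, cost, err))
    # Sort by cost so the result reads as a tradeoff curve.
    return sorted(front, key=lambda c: c[1])


if __name__ == "__main__":
    candidates = [
        ("w8", 100, 0.30),
        ("w10", 150, 0.20),
        ("w12", 150, 0.10),
        ("w12b", 180, 0.12),
        ("w16", 220, 0.04),
    ]
    for cfg in pareto_front(candidates):
        print(cfg)
```

A run-time manager could keep only the surviving configurations and pick among them according to the resources currently available. The quadratic scan is adequate for small candidate sets; larger design spaces are where heuristic searches such as the evolutionary techniques mentioned in the abstract become relevant.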

7 citations

Journal Article•DOI•

SystemC Transaction-Level Modeling of an MPSoC Platform Based on an Open Source ISS by Using Interprocess Communication

S. Boukhechem¹, El-Bay Bourennane•Institutions (1)

University of Burgundy¹

29 Sep 2008-International Journal of Reconfigurable Computing

TL;DR: The aim of this work is to provide designers with the possibility of faster and more efficient architecture exploration at a higher level of abstraction, starting from an algorithmic description to implementation details.

Abstract: Transaction-level modeling (TLM) is a promising technique to deal with the increasing complexity of modern embedded systems. This model allows a system designer to model a complete application, composed of hardware and software parts, at several levels of abstraction. For this purpose, we use SystemC, which is proposed as a standardized modeling language. This paper presents a transaction-level modeling cosimulation methodology for modeling, validating, and verifying our embedded open architecture platform. The proposed platform is an open source multiprocessor system-on-chip (MPSoC) platform, integrated under the synthesis tool for adaptive and reconfigurable system-on-chip (STARSoC) environment. It relies on the integration between an open source instruction set simulator (ISS), the OR1Ksim platform, and the SystemC simulation environment, which contains other components (Wishbone bus, memories, etc.). The aim of this work is to provide designers with the possibility of faster and more efficient architecture exploration at a higher level of abstraction, starting from an algorithmic description to implementation details.

6 citations

Journal Article•DOI•

Neuromorphic Configurable Architecture for Robust Motion Estimation

Guillermo Botella¹, Manuel Rodríguez, Antonio García, Eduardo Ros•Institutions (1)

Complutense University of Madrid¹

01 Jan 2008-International Journal of Reconfigurable Computing

TL;DR: The main contribution of this work is the efficient implementation of a biologically inspired motion algorithm that borrows nature templates as inspiration in the design of architectures and makes use of a specific model of human visual motion perception: the Multichannel Gradient Model (McGM).

Abstract: The robustness of the human visual system in recovering motion estimates in almost any visual situation is enviable, performing enormous calculation tasks continuously, robustly, efficiently, and effortlessly. There is obviously a great deal we can learn from our own visual system. Currently, there are several optical flow algorithms, although none of them deals efficiently with noise, illumination changes, second-order motion, occlusions, and so on. The main contribution of this work is the efficient implementation of a biologically inspired motion algorithm that borrows nature templates as inspiration in the design of architectures and makes use of a specific model of human visual motion perception: the Multichannel Gradient Model (McGM). This novel customizable architecture for neuromorphic robust optical flow can be constructed with an FPGA or ASIC device using properties of the cortical motion pathway, constituting a useful framework for building future complex bioinspired systems running in real time with high computational complexity. This work includes the resource usage and performance data, and a comparison with current systems. This hardware has many application fields, such as object recognition, navigation, or tracking in difficult environments, owing to its bioinspired design and robustness.

Journal Article•DOI•

Design of a Mathematical Unit in FPGA for the Implementation of the Control of a Magnetic Levitation System

Juan José Raygoza-Panduro¹, Susana Ortega-Cisneros, Jorge Rivera, Alberto de la Mora•Institutions (1)

University of Guadalajara¹

30 Dec 2008-International Journal of Reconfigurable Computing

TL;DR: The design and implementation of an automatically generated mathematical unit, from a program developed in Java that describes the VHDL circuit, ready to be synthesized with the Xilinx ISE tool, is presented.

Abstract: This paper presents the design and implementation of an automatically generated mathematical unit, from a program developed in Java that describes the VHDL circuit, ready to be synthesized with the Xilinx ISE tool. The core contains diverse complex operations such as mathematical functions including sine and cosine, among others. The proposed unit is used to synthesize a sliding mode controller for a magnetic levitation system. This kind of system is used in industrial applications requiring a high level of mathematical calculation in short time periods. The core is designed to calculate trigonometric and arithmetic operations in such a way that each function is performed in a clock cycle. In this paper, the results of the mathematical core are shown in terms of implementation, utilization, and application to control a magnetic levitation system.

Journal Article•DOI•

On the power dissipation of embedded memory blocks used to implement logic in field-programmable gate arrays

Scott Y. L. Chin¹, Clarence S. P. Lee¹, Steven J. E. Wilton¹•Institutions (1)

University of British Columbia¹

01 Jan 2008-International Journal of Reconfigurable Computing

TL;DR: It is shown that although embedded memories provide area-efficient implementations of many circuits, this technique results in additional power consumption, and that blocks containing smaller memory arrays are more power efficient than those containing large arrays, but for most array sizes, the memory blocks should be as flexible as possible.

Abstract: We investigate the power and energy implications of using embedded FPGA memory blocks to implement logic. Previous studies have shown that this technique provides extremely dense implementations of some types of logic circuits; however, these previous studies did not evaluate the impact on power. In this paper, we measure the effects on power and energy as a function of three architectural parameters: the number of available memory blocks, the size of the memory blocks, and the flexibility of the memory blocks. We show that although embedded memories provide area-efficient implementations of many circuits, this technique results in additional power consumption. We also show that blocks containing smaller memory arrays are more power efficient than those containing large arrays, but for most array sizes, the memory blocks should be as flexible as possible. Finally, we show that combining physical arrays into larger logical memories, and mapping logic in such a way that some physical arrays can be disabled on each access, can reduce the power consumption penalty. The results were obtained from placed and routed circuits using standard experimental physical design tools and a detailed power model. Several results were also verified through current measurements on a 0.13 µm CMOS FPGA.

Journal Article•DOI•

Area optimisation for field-programmable gate arrays in SystemC hardware compilation

Johan Ditmar¹, Steve McKeever¹, Alex Wilson•Institutions (1)

University of Oxford¹

01 Feb 2008-International Journal of Reconfigurable Computing

TL;DR: A pair of synthesis algorithms that optimise a SystemC design to minimise area when targeting FPGAs can significantly improve the synthesis of a high-level language construct, thus allowing a designer to concentrate more on an algorithm description and less on hardware-specific implementation details.

Abstract: This paper discusses a pair of synthesis algorithms that optimise a SystemC design to minimise area when targeting FPGAs. Each can significantly improve the synthesis of a high-level language construct, thus allowing a designer to concentrate more on an algorithm description and less on hardware-specific implementation details. The first algorithm is a source-level transformation implementing function exlining, in which a separate block of hardware implements a function and is shared between multiple calls to the function. The second is a novel algorithm for mapping arrays to memories, which involves assigning array accesses to memory ports such that no port is ever accessed more than once in a clock cycle. This algorithm assigns accesses to read-only or write-only ports and to read-write ports concurrently, solving the assignment problem more efficiently for a wider range of memories compared to existing methods. Both optimisations operate on a high-level program representation and have been implemented in a commercial SystemC compiler. Experiments show that in suitable circumstances these techniques result in significant reductions in logic utilisation for FPGAs.
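
The port constraint described for the second algorithm can be illustrated with a small Python sketch. It assumes that every array access has already been scheduled to a clock cycle and that each port is read-only, write-only, or read-write. The greedy rule used here (prefer a dedicated port, fall back to a read-write port) is a simplification chosen for illustration and is not the paper's algorithm; all names are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Optional, Tuple

# (access_id, cycle, kind) where kind is "r" (read) or "w" (write).
Access = Tuple[str, int, str]


def assign_ports(
    accesses: List[Access], ports: Dict[str, str]
) -> Optional[Dict[str, str]]:
    """Map each access to a port so no port is used twice in one cycle.

    `ports` maps a port name to its capability: "r", "w", or "rw".
    Returns access_id -> port name, or None if some cycle needs more
    ports of a given kind than the memory provides.
    """
    by_cycle: Dict[int, List[Tuple[str, str]]] = defaultdict(list)
    for acc_id, cycle, kind in accesses:
        by_cycle[cycle].append((acc_id, kind))

    assignment: Dict[str, str] = {}
    for group in by_cycle.values():
        free = dict(ports)  # ports still unused in this cycle
        for acc_id, kind in group:
            # Prefer a dedicated port so read-write ports stay available.
            choice = next((p for p, cap in free.items() if cap == kind), None)
            if choice is None:
                choice = next((p for p, cap in free.items() if cap == "rw"), None)
            if choice is None:
                return None  # this cycle oversubscribes the memory
            assignment[acc_id] = choice
            del free[choice]
    return assignment


if __name__ == "__main__":
    ports = {"p0": "r", "p1": "rw"}
    accesses = [("a", 0, "r"), ("b", 0, "w"), ("c", 1, "r"), ("d", 1, "r")]
    print(assign_ports(accesses, ports))
```

Under these assumptions, each cycle can be handled independently, which is what keeps the sketch short. A compiler would also need to decide the schedule itself and handle memories whose accesses cannot be fixed to a cycle in advance, which this sketch does not attempt.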
