Proceedings Article

High level synthesis: Where are we? A case study on matrix multiplication

TLDR
This paper investigates matrix multiplication using a standard algorithm, the Strassen algorithm, and a sparse algorithm to provide a comprehensive analysis of the capabilities and usability of the Xilinx Vivado HLS tool.
Abstract
One of the pitfalls of FPGA design is the relatively long implementation time compared to alternative architectures such as CPUs, GPUs, or DSPs. This time can be greatly reduced, however, by using tools that generate hardware systems in the form of a hardware description language (HDL) from high-level languages such as C, C++, or Python. Such implementations can be optimized by applying special directives that focus the high-level synthesis (HLS) effort on particular objectives, such as performance, area, throughput, or power consumption. In this paper we examine the benefits of this approach by comparing the performance and design times of HLS-generated systems with those of custom systems for matrix multiplication. We investigate matrix multiplication using a standard algorithm, the Strassen algorithm, and a sparse algorithm to provide a comprehensive analysis of the capabilities and usability of the Xilinx Vivado HLS tool. In our experience, a hardware-oriented electrical engineering student can achieve up to 61% of the performance of custom designs with one third of the effort, thus enabling faster hardware acceleration of many compute-bound algorithms.
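To make the directive-based flow in the abstract concrete, the following is a minimal sketch of the standard matrix multiplication algorithm written in the style accepted by Vivado HLS. It is not the paper's code, which is not reproduced on this page. The dimension `N`, the function names, and the particular choice of `PIPELINE` and `ARRAY_PARTITION` directives are assumptions made for illustration; an ordinary C++ compiler ignores the `#pragma HLS` lines, so the same source can be checked in software first.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical fixed dimension: HLS tools generally need compile-time loop bounds.
constexpr std::size_t N = 4;

// Standard O(N^3) matrix multiplication: c = a * b.
void matmul(const int a[N][N], const int b[N][N], int c[N][N]) {
// Assumed directive choice: split the arrays so that one row of a and one
// column of b can be read in parallel by the inner product.
#pragma HLS ARRAY_PARTITION variable=a complete dim=2
#pragma HLS ARRAY_PARTITION variable=b complete dim=1
    for (std::size_t i = 0; i < N; ++i) {
        for (std::size_t j = 0; j < N; ++j) {
// Assumed directive choice: request one output element per clock cycle.
#pragma HLS PIPELINE II=1
            int acc = 0;
            for (std::size_t k = 0; k < N; ++k) {
                acc += a[i][k] * b[k][j];
            }
            c[i][j] = acc;
        }
    }
}

// Software-only check of the kind run before synthesis:
// multiplying by the identity matrix must return the input unchanged.
void matmul_self_check() {
    const int a[N][N] = {{1, 2, 3, 4}, {5, 6, 7, 8},
                         {9, 10, 11, 12}, {13, 14, 15, 16}};
    const int identity[N][N] = {{1, 0, 0, 0}, {0, 1, 0, 0},
                                {0, 0, 1, 0}, {0, 0, 0, 1}};
    int c[N][N] = {};
    matmul(a, identity, c);
    for (std::size_t i = 0; i < N; ++i) {
        for (std::size_t j = 0; j < N; ++j) {
            assert(c[i][j] == a[i][j]);
        }
    }
}
```

Changing only the directives, without touching the loop nest, is how such a design would typically be steered toward a different objective such as area or throughput. Which settings the authors actually used is not stated here.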


Citations
Journal Article

Are We There Yet? A Study on the State of High-Level Synthesis

TL;DR: The study concludes that HLS is currently a viable option for fast prototyping and for designs with short time to market; to help close the QoR gap, it also surveys literature focused on improving HLS.
Proceedings Article

High Level Synthesis of Complex Applications: An H.264 Video Decoder

TL;DR: A case study using HLS for a full H.264 decoder, an application with over 6000 lines of code and over 100 functions. The authors share their experience with code conversion for synthesizability, various HLS optimizations, HLS limitations when dealing with complex input code, and general design insights.
Journal Article

A Hybrid FPGA-Based System for EEG- and EMG-Based Online Movement Prediction

TL;DR: A novel Field Programmable Gate Array (FPGA)-based system for real-time movement prediction using physiological data that achieves a classification accuracy similar to that of systems using double-precision floating point.
Proceedings Article

Parallel matrix multiplication on memristor-based computation-in-memory architecture

TL;DR: A communication-efficient mapping of a large-scale matrix multiplication algorithm onto the Computation-in-Memory (CIM) architecture is presented. Depending on the matrix size, the CIM architecture achieves several orders of magnitude better total execution time and two orders of magnitude better total energy consumption than a multicore system based on a shared-memory architecture.
Book Chapter

Method to Analyze the Susceptibility of HLS Designs in SRAM-Based FPGAs Under Soft Errors

TL;DR: This paper analyzes four different design architectures implemented in a 28 nm SRAM-based FPGA under fault injection to determine the probability of errors. The proposed characterization method can guide designers in selecting the most efficient architecture with respect to susceptibility to upsets and performance efficiency.
References
Journal Article

High-Level Synthesis for FPGAs: From Prototyping to Deployment

TL;DR: AutoESL's AutoPilot HLS tool, coupled with domain-specific system-level implementation platforms developed by Xilinx, is used as an example to demonstrate the effectiveness of state-of-the-art C-to-FPGA synthesis solutions targeting multiple application domains.
Proceedings Article

LegUp: high-level synthesis for FPGA-based processor/accelerator systems

TL;DR: A new open-source high-level synthesis tool called LegUp that allows software techniques to be used for hardware design and produces hardware solutions of comparable quality to a commercial high-level synthesis tool.
Book

Practical FPGA programming in C

TL;DR: Introduces C-based techniques for building high-performance, FPGA-accelerated software applications and for optimizing FPGA performance, design flexibility, and time to market.
Proceedings Article

Handel-C for rapid prototyping of VLSI coprocessors for real time systems

TL;DR: This paper investigates the effectiveness of using Handel-C, in an academic setting, to develop real-time embedded systems in environments that incorporate reconfigurable FPGA-based co-processor logic.
Proceedings Article

Impulse C vs. VHDL for Accelerating Tomographic Reconstruction

TL;DR: It is demonstrated that Impulse C designs can achieve over 61x improvement over multi-threaded software (8 threads) and close to the same performance as VHDL while significantly reducing the design effort, and that tightly coupled FPGA coprocessors like the XD1000 effectively overcome the traditional communication bottleneck between CPU and FPGA.