Proceedings Article

Bringing C++ productivity to VHDL world: From language definition to a case study

TL;DR: This paper presents a hardware description language based on the VHDL semantics — THDL++, which supports the extended generic concept and improves it further by supporting compile-time lists with “for each” semantics, inheritance, expression type derivation and late binding.
Abstract: In recent years, hardware description languages have evolved to provide faster and more generic ways of describing synthesizable hardware architectures. For example, VHDL 2008 extended the concept of generics from integral numbers to types, packages and subroutines. This paper presents THDL++, a hardware description language based on the VHDL semantics. It supports the extended generic concept and improves it further by supporting compile-time lists with “for each” semantics, inheritance, expression type derivation and late binding. We also present a THDL++ compiler with two back-ends: a synthesizable VHDL-87 back-end that makes it easy to integrate THDL++ into any existing design flow, and a C++ back-end that generates a cycle-accurate model for fast simulation. We illustrate how using THDL++ significantly reduces design effort compared to raw VHDL by making the code more readable and reusable. As a case study, we present a hardware LZSS (ZIP) compressor targeting Xilinx FPGAs, whose development was accelerated by using THDL++. We demonstrate how using THDL++ reduced the number of code lines by a factor of 1.85 compared to VHDL and how using the C++ back-end increased simulation performance by a factor of 8 compared to ModelSim [1]. The THDL++ compiler and an IDE integrated with the Xilinx toolchain are available online [2].
Citations
Proceedings ArticleDOI
30 Nov 2011
TL;DR: This paper presents the first FPGA based accelerator for option pricing with the state-of-the-art Heston model based on advanced Monte Carlo simulations and expects to achieve the same simulation speed as a Nvidia Tesla C2050 GPU, by consuming less than 3% of the energy at the same time.
Abstract: Today, pricing of derivatives (particularly options) in financial institutions is a challenge. Besides the increasing complexity of the products, obtaining fair prices requires more realistic (and therefore complex) models of the underlying asset behavior. Not only because of increasing costs, energy-efficient and accurate pricing of these models is becoming more and more important. In this paper we present, to the best of our knowledge, the first FPGA-based accelerator for option pricing with the state-of-the-art Heston model. It is based on advanced Monte Carlo simulations. Compared to an 8-core Intel Xeon server running at 3.07 GHz, our hybrid FPGA-CPU system saves 89% of the energy and provides around twice the speed. The same system reduces the energy consumption per simulation to around 40% of that of a fully loaded Nvidia Tesla C2050 GPU. For an accelerator built only from three Virtex-5 chips, we expect to achieve the same simulation speed as an Nvidia Tesla C2050 GPU while consuming less than 3% of the energy.

55 citations


Cites methods from "Bringing C++ productivity to VHDL w..."

  • ...For the implementation, we have used our VisualHDL methodology [18]....

    [...]

Proceedings ArticleDOI
21 May 2012
TL;DR: This paper presents a flexible high-performance implementation of the LZSS compression algorithm capable of processing up to 50 MB/s on a Virtex-5 FPGA chip and provides a cycle-accurate estimation tool that allows finding a trade-off between FPGA resource utilization, compression ratio and performance for a specific data sample.
Abstract: The increasing growth of embedded networking applications has created a demand for high-performance logging systems capable of storing huge amounts of high-bandwidth, typically redundant data. An efficient way of maximizing the logger performance is doing a real-time compression of the logged stream. In this paper we present a flexible high-performance implementation of the LZSS compression algorithm capable of processing up to 50 MB/s on a Virtex-5 FPGA chip. We exploit the independently addressable dual-port block RAMs inside the FPGA chip to achieve an average performance of 2 clock cycles per byte. To make the compressed stream compatible with the ZLib library we encode the LZSS algorithm output using a fixed Huffman table defined by the Deflate specification. We also demonstrate how changing the amount of memory allocated to various internal tables impacts the performance and compression ratio. Finally, we provide a cycle-accurate estimation tool that allows finding a trade-off between FPGA resource utilization, compression ratio and performance for a specific data sample.

13 citations

Proceedings ArticleDOI
10 Apr 2012
TL;DR: A modification of the Adaptive Range Coding algorithm used by the 7-Zip compressor, implemented in a Field Programmable Gate Array (FPGA), that supports massive parallelization, makes use of the distributed FPGA logic and achieves compression throughput of more than 50 MB/s when implemented on a Virtex5 FPGA in conjunction with a hardware LZSS coder.
Abstract: Lossless compression algorithms are employed in a wide variety of communication- and storage-related systems. Many embedded applications, such as real-time communication log compression used in automotive systems, impose strict throughput constraints on the compression unit, creating a demand for hardware-accelerated designs. In this paper we present a modification of the Adaptive Range Coding algorithm used by the 7-Zip compressor, implemented in a Field Programmable Gate Array (FPGA). We have improved the algorithm to support massive parallelization, which allows making use of the distributed FPGA logic and achieving compression throughput of more than 50 MB/s when implemented on a Virtex5 FPGA in conjunction with a hardware LZSS coder. Compared to a fixed-table Huffman encoder, our implementation provides the same high throughput and a 20% better compression ratio. Furthermore, we explore several variations of algorithm parameters and show various trade-offs between compression efficiency, FPGA utilization and throughput.

10 citations

Proceedings ArticleDOI
09 Jul 2012
TL;DR: The RIVER architecture is a run-time configurable and programmable fabric for parallel stream processing on FPGAs that achieves higher clock speeds and offers additional, non-trivial features thanks to its sophisticated memory architecture.
Abstract: The RIVER architecture is a run-time configurable and programmable fabric for parallel stream processing on FPGAs. RIVER's memory architecture has been designed to support non-trivial data flows efficiently and in real time. The individual data processing cores are called Dynamic Streaming Engines (DSE). Our cloud-computing-supported design flow generates hundreds of thousands of different DSE cores ahead of time. At run time, users may download pre-synthesized DSE cores according to their requirements, for example area and power consumption. Furthermore, our design flow shields users from traditional HDL design flows and tedious design optimizations. However, we do not impede architectural changes but provide support for them through custom instructions and numerous design-time options. Our results suggest that our architecture performs well for computation- and memory-intensive kernels such as 2-dimensional convolution. Compared to recently published, highly specialized architectures, we achieve higher clock speeds and offer additional, non-trivial features thanks to our sophisticated memory architecture.

3 citations


Cites background from "Bringing C++ productivity to VHDL w..."

  • ...More recent HDLs such as Chisel [7], THDL [8], Bluespec [9–11] and ROCCC [12] aim to increase productivity....

    [...]

Proceedings ArticleDOI
01 Dec 2013
TL;DR: This work presents an expandable and modular HDL (modHDL), developed by applying the Language Oriented Programming paradigm, which consists of a set of modular languages, each with its own scope but with a generic interface to combine or extend for specific requirements.
Abstract: With each product generation the number of application domains FPGAs are suitable for is increasing. However, the rising complexity of devices and applications induces the need for more abstract and efficient programming languages. Many research projects have already addressed the improvement of hardware description languages (HDLs). Since each resulting work is tied to its own framework, extending or combining such works to fit a specific application is hardly possible. In this work an expandable and modular HDL (modHDL) is presented which has been developed by applying the Language Oriented Programming paradigm. modHDL consists of a set of modular languages, each with its own scope but with a generic interface to combine or extend for specific requirements. We will discuss the base language with three extension languages, each covering its own application domain. Furthermore, the utilization process of these languages for further extensions is analyzed in order to demonstrate the convenience of modHDL for high-level hardware design.

2 citations


Cites methods from "Bringing C++ productivity to VHDL w..."

  • ...THDL++ [6] combines the syntax and the basic object oriented approach of C++ and the concurrent semantics of VHDL....

    [...]

References
Book
01 Dec 2006
TL;DR: Detailed descriptions and explanations of the most well-known and frequently used compression methods are covered in a self-contained fashion, with an accessible style and technical level for specialists and nonspecialists.
Abstract: Data compression is one of the most important fields and tools in modern computing. From archiving data, to CD ROMs, and from coding theory to image analysis, many facets of modern computing rely upon data compression. Data Compression provides a comprehensive reference for the many different types and methods of compression. Included are a detailed and helpful taxonomy, analysis of most common methods, and discussions on the use and comparative benefits of methods and description of "how to" use them. The presentation is organized into the main branches of the field of data compression: run length encoding, statistical methods, dictionary-based methods, image compression, audio compression, and video compression. Detailed descriptions and explanations of the most well-known and frequently used compression methods are covered in a self-contained fashion, with an accessible style and technical level for specialists and nonspecialists. Topics and features: coverage of video compression, including MPEG-1 and H.261; thorough coverage of wavelets methods, including CWT, DWT, EZW and the new Lifting Scheme technique; complete audio compression; QM coder used in JPEG and JBIG, including the new JPEG 2000 standard; image transformations and detailed coverage of discrete cosine transform and Haar transform; coverage of EIDAC method for compressing simple images; prefix image compression; ACB and FHM curve compression; geometric compression and edgebreaker technique. Data Compression provides an invaluable reference and guide for all computer scientists, computer engineers, electrical engineers, signal/image processing engineers and other scientists needing a comprehensive compilation for a broad range of compression methods.

1,745 citations


"Bringing C++ productivity to VHDL w..." refers background or methods in this paper

  • ...Additionally, we compare 2 implementations of the fixed-table Huffman coder used by Deflate algorithm [16]....

    [...]

  • ...The main reason why manually coded design was faster than AutoPilot-generated was the data structure: most of the input samples (produced by LZSS algorithm while compressing real-world data) require only 50% of computation and thus can be manually scheduled into 1 clock cycle....

    [...]

  • ...This section compares THDL++ with raw VHDL using two real-world examples: OpenAVR [23], an open-source AVR-compatible [24] processor written in THDL++ and a FPGA-based Deflate [16] compressor that combines LZSS algorithm and a fixed-table Huffman coder....

    [...]

  • ...We have evaluated the generated C++ model performance by simulating compression of a 10-megabyte file using a hardware LZSS compressor and compared the simulation performance to ModelSim....

    [...]

  • ...A fixed-table Huffman coder is a unit that maps fixed-size input words into variable-size output words using a fixed table (computed using Huffman algorithm [16] based on input probabilities)....

    [...]
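
The fixed-table mapping described in the last excerpt above can be illustrated with a minimal C++ sketch. This is not the paper's THDL++ implementation. It assumes the fixed literal/length code defined by the Deflate specification (RFC 1951, section 3.2.6), and the names `HuffmanCode` and `fixedLitLenCode` are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical names; the table is the fixed literal/length code
// from the Deflate specification (RFC 1951, section 3.2.6).
struct HuffmanCode {
    std::uint16_t bits;    // code value, written most significant bit first
    std::uint8_t  length;  // number of valid bits in `bits`
};

// Maps a fixed-size input symbol (0..287) to a variable-size code word.
inline HuffmanCode fixedLitLenCode(unsigned symbol) {
    assert(symbol <= 287);  // only 288 literal/length symbols exist
    if (symbol <= 143)      // 8-bit codes 00110000 .. 10111111
        return {static_cast<std::uint16_t>(0x030 + symbol), 8};
    if (symbol <= 255)      // 9-bit codes 110010000 .. 111111111
        return {static_cast<std::uint16_t>(0x190 + (symbol - 144)), 9};
    if (symbol <= 279)      // 7-bit codes 0000000 .. 0010111
        return {static_cast<std::uint16_t>(symbol - 256), 7};
    // 8-bit codes 11000000 .. 11000111
    return {static_cast<std::uint16_t>(0x0C0 + (symbol - 280)), 8};
}
```

For example, the end-of-block symbol 256 maps to the 7-bit code 0000000, while the literal 144 needs 9 bits. Because the table is fixed, the mapping reduces to a few range comparisons and additions and needs no per-stream table construction.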

Journal ArticleDOI
Jason Cong, Bin Liu, Stephen Neuendorffer, Juanjo Noguera, Kees Vissers, Zhiru Zhang
TL;DR: AutoESL's AutoPilot HLS tool coupled with domain-specific system-level implementation platforms developed by Xilinx are used as an example to demonstrate the effectiveness of state-of-art C-to-FPGA synthesis solutions targeting multiple application domains.
Abstract: Escalating system-on-chip design complexity is pushing the design community to raise the level of abstraction beyond register transfer level. Despite the unsuccessful adoptions of early generations of commercial high-level synthesis (HLS) systems, we believe that the tipping point for transitioning to HLS methodology is happening now, especially for field-programmable gate array (FPGA) designs. The latest generation of HLS tools has made significant progress in providing wide language coverage and robust compilation technology, platform-based modeling, advancement in core HLS algorithms, and a domain-specific approach. In this paper, we use AutoESL's AutoPilot HLS tool coupled with domain-specific system-level implementation platforms developed by Xilinx as an example to demonstrate the effectiveness of state-of-art C-to-FPGA synthesis solutions targeting multiple application domains. Complex industrial designs targeting Xilinx FPGAs are also presented as case studies, including comparison of HLS solutions versus optimized manual designs. In particular, the experiment on a sphere decoder shows that the HLS solution can achieve an 11-31% reduction in FPGA resource usage with improved design productivity compared to hand-coded design.

728 citations


"Bringing C++ productivity to VHDL w..." refers methods in this paper

  • ...To compare THDL++ with another productivity-oriented design tools, we used AutoPilot [19] to synthesize a fixed-table Huffman coder on a Virtex5 family FPGA and compared results with manually coded THDL++....

    [...]

  • ...One implementation is done using THDL++, another is done in C++ and synthesized using AutoPilot [19] – a commercial high-level synthesis tool....

    [...]

  • ...This family of tools (such as CatapultC [18] and AutoPilot [19]) allows using C/C++ languages to describe the hardware....

    [...]

Book
01 Apr 2001
TL;DR: This book introduces the concept of generic components, which enable an easier and more seamless transition from design to application code, generate code that better expresses the original design intention, and support the reuse of design structures with minimal recoding within C++.
Abstract: "Modern C++ Design is an important book. Fundamentally, it demonstrates 'generic patterns' or 'pattern templates' as a powerful new way of creating extensible designs in C++: a new way to combine templates and patterns that you may never have dreamt was possible, but is. If your work involves C++ design and coding, you should read this book. Highly recommended." (Herb Sutter) "What's left to say about C++ that hasn't already been said? Plenty, it turns out." (From the Foreword by John Vlissides) In Modern C++ Design, Andrei Alexandrescu opens new vistas for C++ programmers. Displaying extraordinary creativity and programming virtuosity, Alexandrescu offers a cutting-edge approach to design that unites design patterns, generic programming, and C++, enabling programmers to achieve expressive, flexible, and highly reusable code. This book introduces the concept of generic components (reusable design templates that produce boilerplate code for compiler consumption), all within C++. Generic components enable an easier and more seamless transition from design to application code, generate code that better expresses the original design intention, and support the reuse of design structures with minimal recoding. The author describes the specific C++ techniques and features that are used in building generic components and goes on to implement industrial strength generic components for real-world applications. Recurring issues that C++ developers face in their day-to-day activity are discussed in depth and implemented in a generic way.
These include: policy-based design for flexibility; partial template specialization; typelists (powerful type manipulation structures); patterns such as Visitor, Singleton, Command, and Factories; and multi-method engines. For each generic component, the book presents the fundamental problems and design options, and finally implements a generic solution. In addition, an accompanying Web site, http://www.awl.com/cseng/titles/0-201-70431-5, makes the code implementations available for the generic components in the book and provides a free, downloadable C++ library, called Loki, created by the author. Loki provides out-of-the-box functionality for virtually any C++ project.

568 citations


"Bringing C++ productivity to VHDL w..." refers background or methods in this paper

  • ...Thus, the generics are called template arguments and packages are called policy classes [13]....

    [...]

  • ...This approach is well known in software engineering as policy class-based design and is described by Alexandrescu in [13]....

    [...]
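
As a minimal illustration of the policy class-based design mentioned in the excerpts above, the following plain C++ sketch parameterizes a host class with a policy class passed as a template argument. It is not THDL++ syntax and is not taken from the paper; the names `Width8Policy`, `Width16Policy` and `Accumulator` are hypothetical.

```cpp
#include <cstdint>

// Hypothetical policy classes: each one bundles a word type, much like a
// package passed as a generic.
struct Width8Policy  { using Word = std::uint8_t;  };
struct Width16Policy { using Word = std::uint16_t; };

// Host class whose data width is selected by the policy template argument.
template <typename WidthPolicy>
class Accumulator {
public:
    using Word = typename WidthPolicy::Word;

    // Adds a value, wrapping around at the word width chosen by the policy.
    void add(Word v) { sum_ = static_cast<Word>(sum_ + v); }

    Word value() const { return sum_; }

private:
    Word sum_ = 0;
};
```

Instantiating `Accumulator<Width8Policy>` and adding 200 and then 100 yields 44, because the sum wraps at 8 bits, while `Accumulator<Width16Policy>` yields 300 from the same source code. The policy is resolved at compile time, so selecting a different width adds no run-time indirection.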

Proceedings ArticleDOI
04 Nov 2001
TL;DR: A retargetable framework for ASIP design based on machine descriptions in the LISA language is presented, from which software development tools (HLL C-compiler, assembler, linker, simulator and debugger frontend) can be generated automatically and synthesizable HDL code can be derived.
Abstract: The development of application specific instruction set processors (ASIP) is currently the exclusive domain of the semiconductor houses and core vendors. This is due to the fact that building such an architecture is a difficult task that requires expert knowledge in different domains: application software development tools, processor hardware implementation, and system integration and verification. This paper presents a retargetable framework for ASIP design which is based on machine descriptions in the LISA language. From that, software development tools can be automatically generated including HLL C-compiler, assembler, linker, simulator and debugger frontend. Moreover, synthesizable HDL code can be derived which can then be processed by standard synthesis tools. Implementation results for a low-power ASIP for DVB-T acquisition and tracking algorithms designed with the presented methodology will be given.

97 citations

Journal ArticleDOI
TL;DR: OO-VHDL, the object-oriented extension of VHDL, supports the VHDL computation model and the reactive computation model within the same system and is implemented with a preprocessor and debugging tool to help modelers develop a smooth transition from abstract models to detailed hardware models.
Abstract: Object-oriented extensions to hardware description languages let engineers model systems at a higher level of abstraction, thus helping them manage design complexity and maximize component reuse. OO-VHDL, the object-oriented extension of VHDL that we describe in this article, supports the VHDL computation model and the reactive computation model within the same system. We hope this will help modelers develop a smooth transition from abstract models to detailed hardware models. In implementing the extensions, we were guided by one goal: providing the language to modelers as quickly as possible. This meant OO-VHDL had to be usable in current VHDL simulation environments. Thus, we have implemented a preprocessor that translates OO-VHDL to VHDL and a debugging tool that maps VHDL statements into the OO-VHDL statements from which they were derived.

78 citations


"Bringing C++ productivity to VHDL w..." refers background in this paper

  • ...Among those are [3], object-oriented extensions to VHDL ([4], [5], [6], [7]), a work on extending the SystemC synthesis subset by object-oriented features [8], and the latest VHDL2008 standard [9], that introduced generic packages, functions and types [10]....

    [...]