Proceedings ArticleDOI

Closing the gap: CPU and FPGA trends in sustainable floating-point BLAS performance

TL;DR: The analysis highlights the amount of memory bandwidth and internal storage needed to sustain peak performance with FPGAs, considers the historical context of the last six years, and extrapolates the trends over the next six years.
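The sustained-performance question above comes down to whether a kernel's arithmetic intensity, multiplied by available memory bandwidth, reaches the device's peak rate. A minimal C sketch of that style of estimate follows; it is not the paper's model, and the peak rate, bandwidth, on-chip storage figures, and helper names are placeholder assumptions for illustration only.

/* Back-of-the-envelope estimate of bandwidth-limited sustained
 * double-precision BLAS performance.  All device figures below are
 * placeholder assumptions, not values from the paper. */
#include <math.h>
#include <stdio.h>

static double sustained(double flops_per_byte, double mem_bw_gbs, double peak_gflops) {
    double bw_limited = flops_per_byte * mem_bw_gbs;   /* GFLOP/s if memory-bound */
    return bw_limited < peak_gflops ? bw_limited : peak_gflops;
}

int main(void) {
    const double peak_gflops  = 20.0;           /* assumed peak double-precision rate */
    const double mem_bw_gbs   = 10.0;           /* assumed off-chip bandwidth, GB/s   */
    const double onchip_bytes = 512.0 * 1024.0; /* assumed on-chip storage, bytes     */

    /* BLAS-1 dot product: 2 flops per element, two 8-byte operands streamed in. */
    double i_dot  = 2.0 / 16.0;
    /* BLAS-2 matrix-vector: 2 flops per matrix element, 8 bytes streamed per element. */
    double i_gemv = 2.0 / 8.0;
    /* BLAS-3 matrix-matrix, blocked so three b*b double blocks fit on chip:
       roughly 2*b^3 flops per 2*b^2*8 bytes of off-chip traffic, i.e. b/8 flops/byte. */
    double b      = floor(sqrt(onchip_bytes / (3.0 * 8.0)));
    double i_gemm = b / 8.0;

    printf("dot  : %5.1f GFLOP/s sustainable\n", sustained(i_dot,  mem_bw_gbs, peak_gflops));
    printf("gemv : %5.1f GFLOP/s sustainable\n", sustained(i_gemv, mem_bw_gbs, peak_gflops));
    printf("gemm : %5.1f GFLOP/s sustainable (block edge b = %.0f)\n",
           sustained(i_gemm, mem_bw_gbs, peak_gflops), b);
    return 0;
}

Under these assumptions only the blocked BLAS-3 kernel escapes the bandwidth limit, which is the sense in which on-chip storage determines sustainable performance.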


Citations
Book

Reconfigurable Computing: The Theory and Practice of FPGA-Based Computation

Scott Hauck, André DeHon
TL;DR: This book is intended as an introduction to the entire range of issues important to reconfigurable computing, using FPGAs as the context, or "computing vehicles", to implement this powerful technology.
Proceedings ArticleDOI

A performance and energy comparison of FPGAs, GPUs, and multicores for sliding-window applications

TL;DR: This paper analyzes an important domain of applications, referred to as sliding-window applications, when executed on FPGAs, GPUs, and multicores, and presents optimization strategies and use cases where each device is most effective.
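For context on what a sliding-window kernel looks like, here is a minimal C sketch of a 2D mean filter; the filter choice, image size, window radius, and function name are generic illustrative assumptions, not workloads taken from the paper.

/* Generic 2D sliding-window kernel (a mean filter) of the kind such
 * device comparisons target.  Sizes are illustrative assumptions. */
#include <stdio.h>

#define H 8
#define W 8
#define R 1   /* window radius: (2R+1) x (2R+1) window */

void mean_filter(const float in[H][W], float out[H][W]) {
    for (int y = 0; y < H; ++y) {
        for (int x = 0; x < W; ++x) {
            float sum = 0.0f;
            int count = 0;
            /* Slide the window over each pixel, clamping at the borders. */
            for (int dy = -R; dy <= R; ++dy) {
                for (int dx = -R; dx <= R; ++dx) {
                    int yy = y + dy, xx = x + dx;
                    if (yy >= 0 && yy < H && xx >= 0 && xx < W) {
                        sum += in[yy][xx];
                        ++count;
                    }
                }
            }
            out[y][x] = sum / (float)count;
        }
    }
}

int main(void) {
    float in[H][W], out[H][W];
    for (int y = 0; y < H; ++y)
        for (int x = 0; x < W; ++x)
            in[y][x] = (float)(y * W + x);
    mean_filter(in, out);
    printf("out[0][0] = %.2f\n", out[0][0]);
    return 0;
}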
Proceedings ArticleDOI

Sparse Matrix-Vector multiplication on FPGAs

TL;DR: Besides solving the SpMxV problem, the design provides a parameterized and flexible tree-based architecture for floating-point applications on FPGAs, which demonstrates significant speedup over general-purpose processors, particularly for matrices with very irregular sparsity structure.
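As a plain software reference for the computation the FPGA design accelerates (y = A·x with A in compressed sparse row form), a minimal C sketch follows; it is not the paper's hardware design, and the small example matrix and function name are assumptions for demonstration.

/* Sparse matrix-vector multiply y = A*x in CSR format, shown as a
 * software reference for the SpMxV kernel; example data is illustrative. */
#include <stdio.h>

void spmxv_csr(int n, const int *row_ptr, const int *col_idx,
               const double *val, const double *x, double *y) {
    for (int i = 0; i < n; ++i) {
        double sum = 0.0;
        /* Accumulate only the stored nonzeros of row i. */
        for (int k = row_ptr[i]; k < row_ptr[i + 1]; ++k)
            sum += val[k] * x[col_idx[k]];
        y[i] = sum;
    }
}

int main(void) {
    /* 3x3 matrix [[2,0,1],[0,3,0],[4,0,5]] in CSR form. */
    int row_ptr[] = {0, 2, 3, 5};
    int col_idx[] = {0, 2, 1, 0, 2};
    double val[]  = {2.0, 1.0, 3.0, 4.0, 5.0};
    double x[]    = {1.0, 1.0, 1.0};
    double y[3];
    spmxv_csr(3, row_ptr, col_idx, val, x, y);
    printf("y = [%g, %g, %g]\n", y[0], y[1], y[2]);  /* expect [3, 3, 9] */
    return 0;
}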
Proceedings ArticleDOI

64-bit floating-point FPGA matrix multiplication

TL;DR: A 64-bit ANSI/IEEE Std 754-1985 floating-point design of a hardware matrix multiplier optimized for FPGA implementations is presented, together with a scalable linear array of processing elements (PEs) supporting the proposed algorithm, implemented in Xilinx Virtex II Pro technology.
Journal ArticleDOI

Reconfigurable Computing Architectures

TL;DR: This work surveys the field of reconfigurable computing, providing a guide to the body-of-knowledge accumulated in architecture, compute models, tools, run-time reconfiguration, and applications.
References
Journal ArticleDOI

New trends in high performance computing

TL;DR: The automatically tuned linear algebra software (ATLAS) project is described, as well as the fundamental principles that underlie it, with the present emphasis on the basic linear algebra subprograms (BLAS), a widely used, performance-critical linear algebra kernel library.

Automated Empirical Optimizations of Software and the ATLAS Project (LAPACK Working Note 147)

TL;DR: This paper describes the ATLAS (Automatically Tuned Linear Algebra Software) project, as well as the fundamental principles that underlie it, with the present emphasis on the Basic Linear Algebra Subprograms (BLAS), a widely used, performance-critical linear algebra kernel library.
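To make the "automated empirical optimization" idea concrete, here is a minimal C sketch that times a blocked matrix multiply at several candidate block sizes and keeps the fastest. It is a greatly simplified stand-in for ATLAS's much larger search over code variants; the matrix size, candidate list, and function names are illustrative assumptions.

/* Empirical-tuning sketch: measure a blocked GEMM at several block
 * sizes and select the fastest.  Parameters are illustrative only. */
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

#define N 256

static void gemm_blocked(int n, int bs, const double *A, const double *B, double *C) {
    for (int i0 = 0; i0 < n; i0 += bs)
        for (int k0 = 0; k0 < n; k0 += bs)
            for (int j0 = 0; j0 < n; j0 += bs)
                for (int i = i0; i < i0 + bs && i < n; ++i)
                    for (int k = k0; k < k0 + bs && k < n; ++k) {
                        double a = A[i * n + k];
                        for (int j = j0; j < j0 + bs && j < n; ++j)
                            C[i * n + j] += a * B[k * n + j];
                    }
}

int main(void) {
    double *A = calloc(N * N, sizeof *A);
    double *B = calloc(N * N, sizeof *B);
    double *C = calloc(N * N, sizeof *C);
    int candidates[] = {16, 32, 64, 128};
    int best_bs = candidates[0];
    double best_t = 1e30;

    for (size_t c = 0; c < sizeof candidates / sizeof *candidates; ++c) {
        clock_t t0 = clock();
        gemm_blocked(N, candidates[c], A, B, C);
        double t = (double)(clock() - t0) / CLOCKS_PER_SEC;
        printf("block %3d: %.3f s\n", candidates[c], t);
        if (t < best_t) { best_t = t; best_bs = candidates[c]; }
    }
    printf("selected block size: %d\n", best_bs);
    free(A); free(B); free(C);
    return 0;
}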
Proceedings ArticleDOI

FPGAs vs. CPUs: trends in peak floating-point performance

TL;DR: This paper examines the impact of Moore's Law on the peak floating-point performance of FPGAs, and the results show that peak FPGA floating-point performance is growing significantly faster than peak floating-point performance for a CPU.
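The trend comparison amounts to compound annual growth projected forward. A minimal C sketch of that extrapolation follows; the starting peak rates and growth factors are placeholder assumptions chosen for illustration, not figures from the paper.

/* Compound-growth extrapolation of peak performance.  All numbers are
 * placeholder assumptions, not data from the cited paper. */
#include <math.h>
#include <stdio.h>

int main(void) {
    double cpu_gflops  = 5.0;   /* assumed current peak CPU GFLOP/s    */
    double fpga_gflops = 4.0;   /* assumed current peak FPGA GFLOP/s   */
    double cpu_rate    = 1.4;   /* assumed annual growth factor (CPU)  */
    double fpga_rate   = 1.9;   /* assumed annual growth factor (FPGA) */

    for (int year = 0; year <= 6; ++year) {
        printf("year %d: CPU %7.1f GFLOP/s, FPGA %7.1f GFLOP/s\n",
               year,
               cpu_gflops  * pow(cpu_rate,  year),
               fpga_gflops * pow(fpga_rate, year));
    }
    return 0;
}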
Proceedings ArticleDOI

Quantitative analysis of floating point arithmetic on FPGA based custom computing machines

TL;DR: Using higher-level languages, like VHDL, facilitates the development of custom floating-point operators without significantly impacting operator performance or area; the paper analyzes properties, including area consumption and speed, of working arithmetic operator units used in real-time applications.