Topic

Reconfigurable computing

About: Reconfigurable computing is a research topic. Over its lifetime, 7,854 publications have appeared within this topic, receiving 126,057 citations.


Papers
Proceedings Article
22 Feb 2017
TL;DR: The design of a BNN accelerator is presented that is synthesized from C++ to FPGA-targeted Verilog and outperforms existing FPGA-based CNN accelerators in GOPS as well as energy and resource efficiency.
Abstract: Convolutional neural networks (CNNs) are the current state of the art for many computer vision tasks. CNNs outperform older methods in accuracy, but require vast amounts of computation and memory. As a result, existing CNN applications are typically run on clusters of CPUs or GPUs. Studies into the FPGA acceleration of CNN workloads have achieved reductions in power and energy consumption. However, large GPUs outperform modern FPGAs in throughput, and the existence of compatible deep learning frameworks gives GPUs a significant advantage in programmability. Recent research in machine learning demonstrates the potential of very low precision CNNs, i.e., CNNs with binarized weights and activations. Such binarized neural networks (BNNs) appear well suited for FPGA implementation, as their dominant computations are bitwise logic operations and their memory requirements are reduced. A combination of low-precision networks and high-level design methodology may help address the performance and productivity gap between FPGAs and GPUs. In this paper, we present the design of a BNN accelerator that is synthesized from C++ to FPGA-targeted Verilog. The accelerator outperforms existing FPGA-based CNN accelerators in GOPS as well as energy and resource efficiency.
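
Illustration: the abstract notes that the dominant computations in a BNN are bitwise logic operations. A minimal sketch of why that holds, assuming the common encoding where +1 is stored as bit 1 and -1 as bit 0, is the XNOR-popcount form of a dot product. The function names `pack` and `binarized_dot` are hypothetical and are not taken from the paper's accelerator.

```python
def pack(values):
    """Pack a list of +1/-1 values into an integer, element i at bit i.

    Assumed encoding (not from the paper): +1 -> bit 1, -1 -> bit 0.
    """
    bits = 0
    for i, v in enumerate(values):
        if v == 1:
            bits |= 1 << i
    return bits


def binarized_dot(a_bits, b_bits, n):
    """Dot product of two length-n +1/-1 vectors stored as packed bits.

    XNOR marks the positions where the two vectors agree. Each agreement
    contributes +1 and each disagreement contributes -1, so the result is
    2 * (number of agreements) - n.
    """
    mask = (1 << n) - 1
    agree = ~(a_bits ^ b_bits) & mask  # XNOR restricted to the low n bits
    return 2 * bin(agree).count("1") - n


a = [1, -1, 1, 1]
b = [1, 1, -1, 1]
result = binarized_dot(pack(a), pack(b), len(a))
reference = sum(x * y for x, y in zip(a, b))
```

Here `result` and `reference` agree, which is the property that lets hardware replace multiply-accumulate units with XNOR gates and a population counter.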

379 citations

Journal Article
Weikang Qian, Xin Li, Marc D. Riedel, Kia Bazargan, David J. Lilja
TL;DR: The concept of stochastic logic is applied to a reconfigurable architecture that implements processing operations on a datapath and it is found to be much more tolerant of soft errors than conventional hardware implementations.
Abstract: Mounting concerns over variability, defects, and noise motivate a new approach for digital circuitry: stochastic logic, that is to say, logic that operates on probabilistic signals and so can cope with errors and uncertainty. Techniques for probabilistic analysis of circuits and systems are well established. We advocate a strategy for synthesis. In prior work, we described a methodology for synthesizing stochastic logic, that is to say logic that operates on probabilistic bit streams. In this paper, we apply the concept of stochastic logic to a reconfigurable architecture that implements processing operations on a datapath. We analyze cost as well as the sources of error: approximation, quantization, and random fluctuations. We study the effectiveness of the architecture on a collection of benchmarks for image processing. The stochastic architecture requires less area than conventional hardware implementations. Moreover, it is much more tolerant of soft errors (bit flips) than these deterministic implementations. This fault tolerance scales gracefully to very large numbers of errors.
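
Illustration: the abstract describes logic that operates on probabilistic bit streams. A minimal sketch of that idea, assuming the standard unipolar encoding where a value p in [0, 1] is represented by a stream whose bits are 1 with probability p, is multiplication with a single AND gate. The helper names below are hypothetical, and the software random number generator only stands in for the hardware randomizers; this is not the paper's architecture.

```python
import random


def to_stream(p, n, rng):
    """Encode probability p as n random bits, each 1 with probability p."""
    return [1 if rng.random() < p else 0 for _ in range(n)]


def from_stream(bits):
    """Decode a bit stream back to a value: the fraction of ones."""
    return sum(bits) / len(bits)


def stochastic_multiply(p, q, n=100_000, seed=0):
    """Estimate p * q by ANDing two independent stochastic bit streams.

    For independent streams, P(a AND b = 1) = P(a = 1) * P(b = 1) = p * q.
    The seed is fixed only to make this sketch reproducible.
    """
    rng = random.Random(seed)
    a = to_stream(p, n, rng)
    b = to_stream(q, n, rng)
    return from_stream([x & y for x, y in zip(a, b)])


estimate = stochastic_multiply(0.5, 0.3)
```

The estimate approaches 0.15 as the stream length grows, with an error from random fluctuation that shrinks roughly as one over the square root of the stream length. A single flipped bit changes the decoded value by only 1/n, which gives some intuition for the soft-error tolerance the abstract reports.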

367 citations

Journal Article
TL;DR: This work exhibits a dozen applications where PAM technology proves superior, both in performance and cost, to every other existing technology, including supercomputers, massively parallel machines, and conventional custom hardware.
Abstract: Programmable active memories (PAMs) are a novel form of universal reconfigurable hardware coprocessor. Based on field-programmable gate array (FPGA) technology, a PAM is a virtual machine, controlled by a standard microprocessor, which can be dynamically and indefinitely reconfigured into a large number of application-specific circuits. PAMs offer a new mixture of hardware performance and software versatility. We review the important architectural features of PAMs, through the example of DECPeRLe-1, an experimental device built in 1992. PAM programming is presented, in contrast to classical gate-array and full custom circuit design. Our emphasis is on large, code-generated synchronous system descriptions; no compromise is made with regard to the performance of the target circuits. We exhibit a dozen applications where PAM technology proves superior, both in performance and cost, to every other existing technology, including supercomputers, massively parallel machines, and conventional custom hardware. The fields covered include computer arithmetic, cryptography, error correction, image analysis, stereo vision, video compression, sound synthesis, neural networks, high-energy physics, thermodynamics, biology and astronomy. At comparable cost, the computing power virtually available in a PAM exceeded that of conventional processors by a factor of 10 to 1000 in 1992, depending on the specific application. A technology shrink increases the performance gap between conventional processors and PAMs. By Noyce's law, we predict by how much the performance gap will widen with time.

359 citations

Journal Article
TL;DR: This article presents fast online placement methods for dynamically reconfigurable systems, as well as offline 3D placement algorithms for statically reconfigured architectures.
Abstract: This article presents fast online placement methods for dynamically reconfigurable systems, as well as offline 3D placement algorithms for statically reconfigurable architectures.

346 citations

Patent
11 Dec 1992
TL;DR: An integrated circuit computing device comprises a dynamically configurable Field Programmable Gate Array (FPGA), which is configured to implement a RISC processor and a Reconfigurable Instruction Execution Unit.
Abstract: An integrated circuit computing device is comprised of a dynamically configurable Field Programmable Gate Array (FPGA). This gate array is configured to implement a RISC processor and a Reconfigurable Instruction Execution Unit. Since the FPGA can be dynamically reconfigured, the Reconfigurable Instruction Execution Unit can be dynamically changed to implement complex operations in hardware rather than in time-consuming software routines. This feature allows the computing device to operate at speeds that are orders of magnitude greater than traditional RISC or CISC counterparts. In addition, the programmability of the computing device makes it very flexible and hence, ideally suited to handle a large number of very complex and different applications.

346 citations


Network Information
Related Topics (5)
Cache
59.1K papers, 976.6K citations
85% related
CMOS
81.3K papers, 1.1M citations
84% related
Scalability
50.9K papers, 931.6K citations
83% related
Data compression
43.6K papers, 756.5K citations
83% related
Compiler
26.3K papers, 578.5K citations
82% related
Performance
Metrics
No. of papers in the topic in previous years
Year    Papers
2023    11
2022    57
2021    110
2020    158
2019    168
2018    178