Comparison of architectures of a coarse-grain reconfigurable multiply-accumulate unit

doi:10.1109/ICGCE.2013.6823433

Citations

Journal Article•DOI•

RECON: Resource-Efficient CORDIC-Based Neuron Architecture

Gopal Raut¹, Shubham Rai², Santosh Kumar Vishvakarma¹, Akash Kumar²•Institutions (2)

Indian Institute of Technology Indore¹, Dresden University of Technology²

25 Jan 2021

TL;DR: In this article, the authors propose a resource-efficient Co-ordinate Rotation Digital Computer (CORDIC)-based neuron architecture (RECON) which can be configured to compute both multiply-accumulate (MAC) and non-linear activation function (AF) operations.

Abstract: Contemporary hardware implementations of artificial neural networks face the burden of excess area requirement due to resource-intensive elements such as multipliers and non-linear activation functions. The present work addresses this challenge by proposing a resource-efficient Co-ordinate Rotation Digital Computer (CORDIC)-based neuron architecture (RECON) which can be configured to compute both multiply-accumulate (MAC) and non-linear activation function (AF) operations. The CORDIC-based architecture uses linear and trigonometric relationships to realize MAC and AF operations respectively. The proposed design is synthesized and verified at 45 nm technology using Cadence Virtuoso for all physical parameters. An implementation of the signed fixed-point 8-bit MAC using our design shows a 60% lower area-latency-power product (ALP) and improvements of 38% in area, 27% in power dissipation, and 15% in latency with respect to the state-of-the-art MAC design. Further, Monte-Carlo simulations for process-variations and device-mismatch are performed for both the proposed model and the state-of-the-art to evaluate expectations of functions of randomness in dynamic power variation. The dynamic power variation for our design shows that the worst-case mean is $189.73\,\mu\mathrm{W}$, which is 63% of the state-of-the-art.
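
The abstract states that a linear CORDIC relationship realizes the MAC operation. As a rough illustration only, the sketch below uses the textbook linear-mode CORDIC recurrence in floating point; it is not the RECON datapath, and the function name `cordic_linear_mac`, the iteration count, and the operand range are assumptions made for this example.

```python
def cordic_linear_mac(acc, x, w, iterations=16):
    """Approximate acc + x * w with shift-and-add steps (textbook linear-mode CORDIC).

    Assumes |w| < 2 so that the residual z can be driven toward zero.
    """
    y, z = acc, w
    for i in range(iterations):
        d = 1.0 if z >= 0 else -1.0   # rotation direction: sign of the residual
        step = 2.0 ** -i              # in hardware this is a right shift by i
        y += d * x * step             # accumulate x * (the part of w consumed so far)
        z -= d * step                 # consume that part of w
    return y


result = cordic_linear_mac(0.5, 0.75, -0.6)   # approximates 0.5 + 0.75 * (-0.6)
```

With `iterations` steps the residual is bounded by the last step size, so the error in the product is at most `|x| * 2 ** -(iterations - 1)`. No multiplier is needed in a fixed-point version, since each step is a shift, an add, and a sign test.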

20 citations

Journal Article•DOI•

Low Power Resource Efficient CORDIC Enabled Neuron Architecture using 45nm CMOS Technology

Gopal Raut, Vikas Maheshwari, Rajib Kumar Kar

01 Apr 2023-e-Prime

References

Posted Content•

Resource Sharing and Pipelining in Coarse-Grained Reconfigurable Architecture for Domain-Specific Optimization

Yoonjin Kim¹, Mary Kiemb¹, Chulsoo Park¹, Jinyong Jung¹, Kiyoung Choi¹•Institutions (1)

Seoul National University¹

25 Oct 2007-arXiv: Hardware Architecture

TL;DR: In this article, the authors proposed a reconfigurable array architecture template and design space exploration flow for domain-specific optimization, which can reduce the hardware cost and the delay without any performance degradation for some application domains.

Abstract: Coarse-grained reconfigurable architectures aim to achieve both goals of high performance and flexibility. However, existing reconfigurable array architectures require many resources without considering the specific application domain. Functional resources that take long latency and/or large area can be pipelined and/or shared among the processing elements. Therefore the hardware cost and the delay can be effectively reduced without any performance degradation for some application domains. We suggest such reconfigurable array architecture template and design space exploration flow for domain-specific optimization. Experimental results show that our approach is much more efficient both in performance and area compared to existing reconfigurable architectures.

91 citations

Journal Article•DOI•

Pipelined adders

L. Dadda¹, Vincenzo Piuri¹•Institutions (1)

Polytechnic University of Milan¹

01 Mar 1996-IEEE Transactions on Computers

TL;DR: This paper shows that other schemes can be designed, based on the idea of pipelining a serial-input adder or a ripple-carry adder, to obtain pipelined adders for more than two numbers.

Abstract: A well-known scheme for obtaining high throughput adders is a pipeline in which each stage contains an array of half-adders performing a carry-save addition. This paper shows that other schemes can be designed, based on the idea of pipelining a serial-input adder or a ripple-carry adder. Such schemes offer a considerable savings of components while preserving high throughput. These schemes can be generalized by using (p,q) parallel counters to obtain pipelined adders for more than two numbers.
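
The "well-known scheme" mentioned at the start of this abstract, a pipeline whose stages are arrays of half-adders, can be modeled behaviorally as follows. This is a sketch of that baseline only, not of the serial-input or ripple-carry schemes the paper proposes; the names `half_adder_stage` and `pipelined_add` and the default `width` are assumptions for illustration.

```python
def half_adder_stage(s, c):
    """One pipeline stage: a bitwise array of half-adders.

    Each bit position produces a sum bit (XOR) and a carry bit (AND);
    the carry word is shifted left to line up with the next bit position.
    The invariant s + c is preserved by every stage.
    """
    return s ^ c, (s & c) << 1


def pipelined_add(a, b, width=8):
    """Add two non-negative width-bit integers with width + 1 half-adder stages."""
    s, c = a, b
    for _ in range(width + 1):   # after width + 1 stages the carry word is zero
        s, c = half_adder_stage(s, c)
    return s


total = pipelined_add(200, 100)   # 300
```

Each stage has a delay of one half-adder regardless of word length, which is why the scheme gives high throughput when a register is placed between stages; the cost is one full-width row of half-adders per stage.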

48 citations

Journal Article•DOI•

Hardware Resource Virtualization for Dynamically Partially Reconfigurable Systems

Chun-Hsian Huang¹, Pao-Ann Hsiung¹•Institutions (1)

National Chung Cheng University¹

01 May 2009-IEEE Embedded Systems Letters

TL;DR: A virtual hardware mechanism, including logic virtualization and hardware device virtualization, is proposed for dynamically partially reconfigurable systems; it can reduce the time required by up to 26% compared with conventional hardware reuse.

Abstract: The dynamic partial reconfiguration technology enables an embedded system to adapt its hardware functionalities at run-time to changing environment conditions. However, reconfigurable hardware functions are still managed as conventional hardware devices, and the enhancement of system performance using the partial reconfiguration technology is thus still limited. To further raise the utilization of reconfigurable hardware designs, we propose a virtual hardware mechanism, including the logic virtualization and the hardware device virtualization, for dynamically partially reconfigurable systems. Using the logic virtualization technique, a hardware function that has been configured in the field-programmable gate array (FPGA) can be virtualized to support more than one software application at run-time. Using the hardware device virtualization, a software application can access two or more different hardware functions through the same device node. In a network security reconfigurable system for multimedia applications, our experimental results also demonstrate that the utilization of reconfigurable hardware functions can be further raised using the virtual hardware mechanism. Furthermore, the virtual hardware mechanism can also reduce up to 26% of the time required by using the conventional hardware reuse.

39 citations

Proceedings Article•DOI•

Multi-operand Floating-Point Addition

A.F. Tenca¹•Institutions (1)

Synopsys¹

08 Jun 2009

TL;DR: The design of a component to perform parallel addition of multiple floating-point (FP) operands is explored and the proposed design is more accurate than conventional FP addition using a network of 2-operand FP adders and it may have competitive area and delay depending on the number of input operands.

Abstract: The design of a component to perform parallel addition of multiple floating-point (FP) operands is explored in this work. In particular, a 3-input FP adder is discussed in more detail, but the main concepts and ideas presented in this work are valid for FP adders with more inputs. The proposed design is more accurate than conventional FP addition using a network of 2-operand FP adders and it may have competitive area and delay depending on the number of input operands. Implementation results of a 3-operand FP adder are presented to compare its performance to a network of 2-input FP adders.
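
The accuracy claim can be illustrated with a small numerical experiment. The sketch below assumes, and this is an assumption not stated in the abstract, that a 3-operand adder forms the exact sum and rounds once, whereas a network of 2-operand adders rounds after every addition. The helper names `chained_sum3` and `single_rounding_sum3` are hypothetical.

```python
from fractions import Fraction


def chained_sum3(a, b, c):
    """Model a network of 2-operand FP adders: round after each addition."""
    return (a + b) + c


def single_rounding_sum3(a, b, c):
    """Model a 3-operand adder that rounds only once (assumed behavior).

    The sum is formed exactly with rationals, then converted to the
    nearest double in a single rounding step.
    """
    return float(Fraction(a) + Fraction(b) + Fraction(c))


chained = chained_sum3(1e16, 1.0, 1.0)          # each 1.0 is lost to rounding
single = single_rounding_sum3(1e16, 1.0, 1.0)   # exact sum 1e16 + 2 is representable
```

Here the spacing between adjacent doubles near 1e16 is 2, so adding 1.0 twice in sequence leaves the result at 1e16, while the exact three-operand sum rounds to 1e16 + 2.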

35 citations

Proceedings Article•DOI•

Performance Analysis of Fast Adders Using VHDL

R. P. Singh, Parveen Kumar, Balwinder Singh¹•Institutions (1)

Centre for Development of Advanced Computing¹

27 Oct 2009

TL;DR: The modified carry skip adders presented in this paper provide better speed and power consumption compared to the conventional carry skip adder and other adders such as the ripple carry adder, carry lookahead adder, Ling adder, and carry select adder.

Abstract: This paper presents a performance analysis of different fast adders. The comparison is based on three performance parameters: area, speed, and power consumption. Further, we present a design methodology for hybrid carry lookahead/carry skip adders (CLSKAs). This modified carry skip adder is modeled using both fixed and variable block sizes. In a conventional carry skip adder, each block consists of a ripple carry adder, and skip logic is used after each block to generate the carry for the next block. The speed of operation depends on carry propagation from the previous block to the next block. In CLSKAs, we use carry lookahead logic in each block to generate the carry for the next block. The modified carry skip adders presented in this paper provide better speed and power consumption compared to the conventional carry skip adder and other adders such as the ripple carry adder, carry lookahead adder, Ling adder, and carry select adder. The modified carry skip adders with fixed blocks require a few more CLBs because of the carry lookahead logic, whereas with the variable block scheme, area optimization is achieved.
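
The conventional carry skip adder described in this abstract (ripple carry blocks followed by skip logic) can be sketched as a bit-level behavioral model. This illustrates only the conventional structure, not the paper's VHDL or its hybrid CLSKA design; the function name `carry_skip_add` and the default `width` and `block` values are assumptions for this example.

```python
def carry_skip_add(a, b, width=16, block=4):
    """Behavioral model of a conventional carry skip adder.

    Returns (sum, carry_out) for two non-negative width-bit integers.
    """
    carry = 0
    result = 0
    for lo in range(0, width, block):
        block_carry_in = carry
        all_propagate = True
        # Ripple carry adder inside the block.
        for i in range(lo, min(lo + block, width)):
            ai = (a >> i) & 1
            bi = (b >> i) & 1
            p = ai ^ bi                       # propagate signal for this bit
            result |= (p ^ carry) << i        # sum bit
            carry = (ai & bi) | (p & carry)   # generate or propagate the carry
            all_propagate = all_propagate and p == 1
        # Skip logic: if every bit propagates, the block carry-out equals
        # the block carry-in, so hardware can bypass the ripple chain.
        if all_propagate:
            carry = block_carry_in
    return result, carry


total, carry_out = carry_skip_add(0x0FFF, 0x0001)   # (0x1000, 0)
```

In software the skip assignment does not change the result, because a fully propagating block already passes its carry-in through. In hardware the bypass shortens the critical path, which is the speed benefit the abstract refers to.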

34 citations
