Journal ArticleDOI

Low-Power Digital Signal Processing Using Approximate Adders

TL;DR: This paper proposes logic complexity reduction at the transistor level as an alternative approach to take advantage of the relaxation of numerical accuracy, and demonstrates the utility of these approximate adders in two digital signal processing architectures with specific quality constraints.
Abstract: Low power is an imperative requirement for portable multimedia devices employing various signal processing algorithms and architectures. In most multimedia applications, human beings can gather useful information from slightly erroneous outputs. Therefore, we do not need to produce exactly correct numerical outputs. Previous research in this context exploits error resiliency primarily through voltage overscaling, utilizing algorithmic and architectural techniques to mitigate the resulting errors. In this paper, we propose logic complexity reduction at the transistor level as an alternative approach to take advantage of the relaxation of numerical accuracy. We demonstrate this concept by proposing various imprecise or approximate full adder cells with reduced complexity at the transistor level, and utilize them to design approximate multi-bit adders. In addition to the inherent reduction in switched capacitance, our techniques result in significantly shorter critical paths, enabling voltage scaling. We design architectures for video and image compression algorithms using the proposed approximate arithmetic units and evaluate them to demonstrate the efficacy of our approach. We also derive simple mathematical models for error and power consumption of these approximate adders. Furthermore, we demonstrate the utility of these approximate adders in two digital signal processing architectures (discrete cosine transform and finite impulse response filter) with specific quality constraints. Simulation results indicate up to 69% power savings using the proposed approximate adders, when compared to existing implementations using accurate adders.
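
The paper's contribution is a set of transistor-level cell designs, which cannot be reproduced in software, but the architectural idea — replace the least-significant full-adder cells with simplified ones so that the carry chain, and hence the critical path, is shortened — can be illustrated behaviorally. The sketch below uses a hypothetical simplified cell of our own, not one of the paper's mirror-adder variants:

```python
# Behavioral sketch of a multi-bit adder built from simplified full-adder
# cells. This is NOT the paper's transistor-level design; it only illustrates
# how pruning a cell's logic truncates the carry chain in the low-order
# positions, shortening the critical path at the cost of bounded LSB errors.

def accurate_fa(a, b, cin):
    """Standard full adder: returns (sum, carry_out)."""
    return a ^ b ^ cin, (a & b) | (a & cin) | (b & cin)

def approx_fa(a, b, cin):
    """Hypothetical simplified cell: ignores carry-in entirely,
    so no carry ripples through the approximate positions."""
    return a ^ b, a & b

def add_nbit(x, y, n, k):
    """n-bit ripple adder whose k least-significant cells are approximate."""
    s, cin = 0, 0
    for i in range(n):
        bit = lambda v: (v >> i) & 1
        fa = approx_fa if i < k else accurate_fa
        si, cin = fa(bit(x), bit(y), cin)
        s |= si << i
    return s

if __name__ == "__main__":
    # Error is confined to the approximate low-order bits.
    print(add_nbit(103, 29, 8, 3), "vs exact", 103 + 29)  # 130 vs 132
```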
Citations
Proceedings ArticleDOI
27 May 2013
TL;DR: This paper reviews recent progress in the area, including design of approximate arithmetic blocks, pertinent error and quality measures, and algorithm-level techniques for approximate computing.
Abstract: Approximate computing has recently emerged as a promising approach to energy-efficient design of digital systems. Approximate computing relies on the ability of many systems and applications to tolerate some loss of quality or optimality in the computed result. By relaxing the need for fully precise or completely deterministic operations, approximate computing techniques allow substantially improved energy efficiency. This paper reviews recent progress in the area, including design of approximate arithmetic blocks, pertinent error and quality measures, and algorithm-level techniques for approximate computing.
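
As a concrete illustration of the "error and quality measures" the survey refers to, the sketch below computes three metrics common in this literature (error rate, mean error distance, and mean relative error distance) by exhaustive simulation of a small arithmetic unit; the function names are ours, not the survey's:

```python
# Exhaustive evaluation of an approximate circuit's error characteristics
# over all input pairs of a small operand width.

def error_stats(approx_fn, exact_fn, n_bits):
    total = err_count = ed_sum = red_sum = 0
    for x in range(2 ** n_bits):
        for y in range(2 ** n_bits):
            exact, approx = exact_fn(x, y), approx_fn(x, y)
            ed = abs(approx - exact)
            total += 1
            err_count += ed != 0
            ed_sum += ed
            red_sum += ed / exact if exact else 0  # skip zero exact results
    return {
        "error_rate": err_count / total,  # fraction of erroneous outputs
        "MED": ed_sum / total,            # mean error distance
        "MRED": red_sum / total,          # mean relative error distance
    }

# Example: a 4-bit adder whose output is truncated to 4 bits (drops carry-out).
stats = error_stats(lambda a, b: (a + b) & 0xF, lambda a, b: a + b, 4)
```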

921 citations


Cites background from "Low-Power Digital Signal Processing..."

  • ...SC offers advantages such as hardware simplicity and fault tolerance [8]....

    [...]

Proceedings ArticleDOI
07 Jun 2015
TL;DR: A low-latency generic accuracy configurable adder to support variable approximation modes that provides a higher number of potential configurations compared to state-of-the-art, thus enabling a high degree of design flexibility and trade-off between performance and output quality.
Abstract: High performance approximate adders typically comprise multiple smaller sub-adders, carry prediction units, and error correction units. In this paper, we present a low-latency generic accuracy configurable adder to support variable approximation modes. It provides a higher number of potential configurations compared to the state of the art, thus enabling a high degree of design flexibility and trade-off between performance and output quality. An error correction unit is integrated to provide accurate results for cases where high accuracy is required. Furthermore, an associated scheme for error probability estimation allows convenient comparison of different approximate adder configurations without the need to numerically simulate the adder. Our experimental results validate the developed error model and also the lower latency of our generic accuracy configurable adder over state-of-the-art approximate adders. For functional verification and prototyping, we have used a Xilinx Virtex-6 FPGA. Our adder model and synthesizable RTL are made open-source.
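
The cited adder's exact architecture is available as open-source RTL; the sketch below only illustrates the general principle such block-based adders share — sub-adders whose carry-ins are speculated from a few preceding bits instead of waiting for the full ripple. The block and prediction-window parameters are illustrative, not the cited design's configuration space:

```python
# Block-based approximate addition with speculative carry prediction.

def block_approx_add(x, y, n_bits, block=4, window=4):
    result = 0
    for lo in range(0, n_bits, block):
        if lo == 0:
            cin = 0
        else:
            # Predict this block's carry-in from the `window` bits below it.
            w_lo = max(0, lo - window)
            mask = (1 << (lo - w_lo)) - 1
            xs, ys = (x >> w_lo) & mask, (y >> w_lo) & mask
            cin = (xs + ys) >> (lo - w_lo)  # speculative carry
        mask = (1 << block) - 1
        blk = ((x >> lo) & mask) + ((y >> lo) & mask) + cin
        result |= (blk & mask) << lo
    return result

# Mispredictions occur only when a true carry chain is longer than `window`,
# which is exactly the case an error correction unit would detect and fix.
```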

274 citations

Journal ArticleDOI
TL;DR: Synthesis results reveal that two proposed multipliers achieve power savings of 72% and 38%, respectively, compared to an exact multiplier, and have better precision when compared to existing approximate multipliers.
Abstract: Approximate computing can decrease the design complexity with an increase in performance and power efficiency for error resilient applications. This brief deals with a new design approach for approximation of multipliers. The partial products of the multiplier are altered to introduce varying probability terms. Logic complexity of approximation is varied for the accumulation of altered partial products based on their probability. The proposed approximation is utilized in two variants of 16-bit multipliers. Synthesis results reveal that two proposed multipliers achieve power savings of 72% and 38%, respectively, compared to an exact multiplier. They have better precision when compared to existing approximate multipliers. Mean relative error figures are as low as 7.6% and 0.02% for the proposed approximate multipliers, which are better than the previous works. Performance of the proposed multipliers is evaluated with an image processing application, where one of the proposed models achieves the highest peak signal to noise ratio.
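
The specific partial-product alterations are detailed in the brief itself; the following sketch only illustrates the general idea of applying cheaper accumulation (here, an OR reduction, which is our assumption rather than the brief's scheme) to the low-significance columns of a multiplier's partial-product array:

```python
# Illustrative approximate multiplier: low-significance partial-product
# columns are OR-compressed instead of being added exactly. `k` marks the
# column boundary below which the approximation is applied.

def approx_multiply(x, y, n_bits=8, k=6):
    cols = [[] for _ in range(2 * n_bits)]
    for i in range(n_bits):          # generate AND-gate partial products
        for j in range(n_bits):
            cols[i + j].append((x >> i) & (y >> j) & 1)
    result = carry = 0
    for c, bits in enumerate(cols):
        if c < k:                    # approximate columns: OR reduction, no carries
            result |= (1 if any(bits) else 0) << c
        else:                        # exact columns: count bits and carry up
            total = sum(bits) + carry
            result |= (total & 1) << c
            carry = total >> 1
    return result
```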

236 citations


Cites background from "Low-Power Digital Signal Processing..."

  • ...In [1], approximate full adders are proposed at transistor level and they are utilized in digital signal processing applications....

    [...]

Proceedings ArticleDOI
02 Nov 2015
TL;DR: This paper designs a novel approximate multiplier to have an unbiased error distribution, which leads to lower computational errors in real applications because errors cancel each other out, rather than accumulate, as the multiplier is used repeatedly for a computation.
Abstract: Many applications for signal processing, computer vision and machine learning show an inherent tolerance to some computational error. This error resilience can be exploited to trade off accuracy for savings in power consumption and design area. Since multiplication is an essential arithmetic operation for these applications, in this paper we focus specifically on this operation and propose a novel approximate multiplier with a dynamic range selection scheme. We design the multiplier to have an unbiased error distribution, which leads to lower computational errors in real applications because errors cancel each other out, rather than accumulate, as the multiplier is used repeatedly for a computation. Our approximate multiplier design is also scalable, enabling designers to parameterize it depending on their accuracy and power targets. Furthermore, our multiplier benefits from a reduction in propagation delay, which enables its use on the critical path. We theoretically analyze the error of our design as a function of its parameters and evaluate its performance for a number of applications in image processing, and machine classification. We demonstrate that our design can achieve power savings of 54% -- 80%, while introducing bounded errors with a Gaussian distribution with near-zero average and standard deviations of 0.45% -- 3.61%. We also report power savings of up to 58% when using the proposed design in applications. We show that our design significantly outperforms other approximate multipliers recently proposed in the literature.
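
A minimal sketch of a dynamic range selection scheme in the spirit described above: only a fixed-width window starting at each operand's leading one is multiplied, and setting the lowest kept bit to 1 compensates for the discarded tail so errors center near zero. The window width k and the exact bit manipulations are illustrative:

```python
# Dynamic-range approximate multiplication: multiply only the k bits
# below each operand's leading one, then shift the product back.

def drum_like_mult(x, y, k=6):
    def truncate(v):
        msb = v.bit_length() - 1
        if msb < k:                        # small values are kept exactly
            return v, 0
        shift = msb - k + 1
        return (v >> shift) | 1, shift     # keep k bits; set LSB to unbias

    xt, sx = truncate(x)
    yt, sy = truncate(y)
    return (xt * yt) << (sx + sy)

if __name__ == "__main__":
    print(drum_like_mult(1000, 2000), "vs exact", 1000 * 2000)
```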

231 citations


Cites background from "Low-Power Digital Signal Processing..."

  • ...proposed several approximate adder designs that, by removing some of the logic used in a traditional mirror adder, achieved improved power, area, and performance [3]....

    [...]

Journal ArticleDOI
TL;DR: A review and classification are presented for current designs of approximate arithmetic circuits, including adders, multipliers, and dividers; improvements in delay, power, and area are also reported for the detection of differences in images using approximate dividers.
Abstract: Often as the most important arithmetic modules in a processor, adders, multipliers, and dividers determine the performance and energy efficiency of many computing tasks. The demand of higher speed and power efficiency, as well as the feature of error resilience in many applications (e.g., multimedia, recognition, and data analytics), have driven the development of approximate arithmetic design. In this article, a review and classification are presented for the current designs of approximate arithmetic circuits including adders, multipliers, and dividers. A comprehensive and comparative evaluation of their error and circuit characteristics is performed for understanding the features of various designs. By using approximate multipliers and adders, the circuit for an image processing application consumes as little as 47% of the power and 36% of the power-delay product of an accurate design while achieving similar image processing quality. Improvements in delay, power, and area are obtained for the detection of differences in images by using approximate dividers.
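
As one concrete example of the divider designs such a survey classifies, the sketch below shows a generic truncation-based approximate divider (our illustration, not a specific design from the article): only the leading k bits of each operand enter a small exact divider, and the quotient is shifted back:

```python
# Truncation-based approximate division: a small exact divider operates
# on the leading k bits of each operand; the result is rescaled.

def approx_divide(x, y, k=8):
    if y == 0:
        raise ZeroDivisionError
    sx = max(0, x.bit_length() - k)
    sy = max(0, y.bit_length() - k)
    q = (x >> sx) // (y >> sy)       # small exact divider on leading bits
    shift = sx - sy
    return q << shift if shift >= 0 else q >> -shift
```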

197 citations


Cites background or methods from "Low-Power Digital Signal Processing..."

  • ...Another method for reducing the critical path delay and power dissipation is by approximating a full adder [Mahdiani et al. 2010; Gupta et al. 2013; Yang et al. 2013; Cai et al. 2016; Angizi et al. 2017]....

    [...]

  • ...…the simple use of OR gates (and one AND gate for carry propagation) in the so-called lower-part-OR adder (LOA) (Figure 7) [Mahdiani et al. 2010], the approximate designs of the mirror adder (AMAs) [Gupta et al. 2013], and the approximate XOR/XNOR-based full adders (AXAs) [Yang et al. 2013]....

    [...]
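
The lower-part-OR adder (LOA) quoted in the last excerpt is simple enough to model directly: the k low-order result bits are the bitwise OR of the operands, and a single AND gate of the two most significant lower-part bits supplies the carry into an exact upper-part adder. A behavioral sketch:

```python
# Behavioral model of the lower-part-OR adder (LOA) [Mahdiani et al. 2010].

def loa_add(x, y, k):
    mask = (1 << k) - 1
    lower = (x | y) & mask                           # OR gates, no carry chain
    carry = ((x >> (k - 1)) & (y >> (k - 1))) & 1    # one AND gate for carry
    upper = ((x >> k) + (y >> k) + carry) << k       # exact upper-part adder
    return upper | lower

if __name__ == "__main__":
    print(loa_add(0b10110111, 0b01011101, 4),
          "vs exact", 0b10110111 + 0b01011101)       # 271 vs 276
```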

References
Journal ArticleDOI
TL;DR: The Baseline method has been by far the most widely implemented JPEG method to date, and is sufficient in its own right for a large number of applications.
Abstract: For the past few years, a joint ISO/CCITT committee known as JPEG (Joint Photographic Experts Group) has been working to establish the first international compression standard for continuous-tone still images, both grayscale and color. JPEG’s proposed standard aims to be generic, to support a wide variety of applications for continuous-tone images. To meet the differing needs of many applications, the JPEG standard includes two basic compression methods, each with various modes of operation. A DCT-based method is specified for “lossy’’ compression, and a predictive method for “lossless’’ compression. JPEG features a simple lossy technique known as the Baseline method, a subset of the other DCT-based modes of operation. The Baseline method has been by far the most widely implemented JPEG method to date, and is sufficient in its own right for a large number of applications. This article provides an overview of the JPEG standard, and focuses in detail on the Baseline method.
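
The Baseline method's lossy path is built on the 8x8 forward DCT-II. A minimal separable implementation is sketched below; quantization and entropy coding, which follow the transform in a real codec, are omitted:

```python
# Separable 2D DCT-II, computed as 1D transforms over rows then columns.
import math

def dct_1d(v):
    n = len(v)
    out = []
    for u in range(n):
        c = math.sqrt(1 / n) if u == 0 else math.sqrt(2 / n)
        out.append(c * sum(v[i] * math.cos((2 * i + 1) * u * math.pi / (2 * n))
                           for i in range(n)))
    return out

def dct_2d(block):                  # block: 8x8 list of pixel rows
    rows = [dct_1d(r) for r in block]
    cols = [dct_1d(list(c)) for c in zip(*rows)]
    return [list(r) for r in zip(*cols)]
```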

3,944 citations

Journal ArticleDOI
TL;DR: The author provides an overview of the JPEG standard, and focuses in detail on the Baseline method, which has been by far the most widely implemented JPEG method to date, and is sufficient in its own right for a large number of applications.
Abstract: A joint ISO/CCITT committee known as JPEG (Joint Photographic Experts Group) has been working to establish the first international compression standard for continuous-tone still images, both grayscale and color. JPEG's proposed standard aims to be generic, to support a wide variety of applications for continuous-tone images. To meet the differing needs of many applications, the JPEG standard includes two basic compression methods, each with various modes of operation. A DCT (discrete cosine transform)-based method is specified for 'lossy' compression, and a predictive method for 'lossless' compression. JPEG features a simple lossy technique known as the Baseline method, a subset of the other DCT-based modes of operation. The Baseline method has been by far the most widely implemented JPEG method to date, and is sufficient in its own right for a large number of applications. The author provides an overview of the JPEG standard, and focuses in detail on the Baseline method.

3,425 citations

Book
01 Jan 2007
TL;DR: This book discusses Digital Signal Processing Systems, Pipelining and Parallel Processing, Synchronous, Wave, and Asynchronous Pipelines, and Bit-Level Arithmetic Architectures.
Abstract: Introduction to Digital Signal Processing Systems. Iteration Bound. Pipelining and Parallel Processing. Retiming. Unfolding. Folding. Systolic Architecture Design. Fast Convolution. Algorithmic Strength Reduction in Filters and Transforms. Pipelined and Parallel Recursive and Adaptive Filters. Scaling and Roundoff Noise. Digital Lattice Filter Structures. Bit-Level Arithmetic Architectures. Redundant Arithmetic. Numerical Strength Reduction. Synchronous, Wave, and Asynchronous Pipelines. Low-Power Design. Programmable Digital Signal Processors. Appendices. Index.

1,361 citations

Book
26 Jan 2014
TL;DR: An introduction to the algorithms and architectures that form the underpinnings of the image and video compression standards, including JPEG, H.261 and H.263, while fully addressing the architectural considerations involved when implementing these standards.
Abstract: From the Publisher: Image and Video Compression Standards: Algorithms and Architectures, Second Edition presents an introduction to the algorithms and architectures that form the underpinnings of the image and video compression standards, including JPEG (compression of still images), H.261 and H.263 (video teleconferencing), and MPEG-1 and MPEG-2 (video storage and broadcasting). The next generation of audiovisual coding standards, such as MPEG-4 and MPEG-7, are also briefly described. In addition, the book covers the MPEG and Dolby AC-3 audio coding standards and emerging techniques for image and video compression, such as those based on wavelets and vector quantization. Image and Video Compression Standards: Algorithms and Architectures, Second Edition emphasizes the foundations of these standards; namely, techniques such as predictive coding, transform-based coding such as the discrete cosine transform (DCT), motion estimation, motion compensation, and entropy coding, as well as how they are applied in the standards. The implementation details of each standard are avoided; however, the book provides all the material necessary to understand the workings of each of the compression standards, including information that can be used by the reader to evaluate the efficiency of various software and hardware implementations conforming to these standards. Particular emphasis is placed on those algorithms and architectures that have been found to be useful in practical software or hardware implementations. Image and Video Compression Standards: Algorithms and Architectures, Second Edition uniquely covers all major standards (JPEG, MPEG-1, MPEG-2, MPEG-4, H.261, H.263) in a simple and tutorial manner, while fully addressing the architectural considerations involved when implementing these standards. As such, it serves as a valuable reference for the graduate student, researcher, or engineer. The book is also used frequently as a text for courses on the subject, in both academic and professional settings.

726 citations

Journal ArticleDOI
TL;DR: It is shown that these proposed Bio-inspired Imprecise Computational blocks (BICs) can be exploited to efficiently implement a three-layer face recognition neural network and the hardware defuzzification block of a fuzzy processor.
Abstract: The conventional digital hardware computational blocks with different structures are designed to compute the precise results of the assigned calculations. The main contribution of our proposed Bio-inspired Imprecise Computational blocks (BICs) is that they are designed to provide an applicable estimation of the result instead of its precise value at a lower cost. These novel structures are more efficient in terms of area, speed, and power consumption with respect to their precise rivals. Complete descriptions of sample BIC adder and multiplier structures as well as their error behaviors and synthesis results are introduced in this paper. It is then shown that these BIC structures can be exploited to efficiently implement a three-layer face recognition neural network and the hardware defuzzification block of a fuzzy processor.
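
The paper's BIC structures are transistor-level designs; the sketch below only captures the spirit of an imprecise array multiplier by dropping partial-product cells below a significance threshold. The threshold parameter and this exact pruning rule are our assumptions, not the paper's circuits:

```python
# Array multiplier with low-significance partial-product cells omitted,
# removing the corresponding adder cells (and their power) from the array.

def imprecise_mult(x, y, n_bits=8, omit=5):
    acc = 0
    for i in range(n_bits):
        for j in range(n_bits):
            if i + j >= omit:        # keep only the significant cells
                acc += (((x >> i) & 1) & ((y >> j) & 1)) << (i + j)
    return acc                       # exact when omit=0
```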

458 citations