Author

Ali Ahmadi

Bio: Ali Ahmadi is an academic researcher from the University of Texas at Dallas. The author has contributed to research on the topics of cost reduction and fault tolerance. The author has an h-index of 8 and has co-authored 23 publications receiving 470 citations. Previous affiliations of Ali Ahmadi include the University of Tehran and the University of Kashan.

Papers
Journal ArticleDOI
TL;DR: It is shown that these proposed Bio-inspired Imprecise Computational blocks (BICs) can be exploited to efficiently implement a three-layer face recognition neural network and the hardware defuzzification block of a fuzzy processor.
Abstract: The conventional digital hardware computational blocks with different structures are designed to compute the precise results of the assigned calculations. The main contribution of our proposed Bio-inspired Imprecise Computational blocks (BICs) is that they are designed to provide an applicable estimation of the result instead of its precise value at a lower cost. These novel structures are more efficient in terms of area, speed, and power consumption with respect to their precise rivals. Complete descriptions of sample BIC adder and multiplier structures as well as their error behaviors and synthesis results are introduced in this paper. It is then shown that these BIC structures can be exploited to efficiently implement a three-layer face recognition neural network and the hardware defuzzification block of a fuzzy processor.
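
As a rough illustration of the imprecise-computation idea (not the paper's actual BIC adder or multiplier circuits), the Python sketch below approximates the low-order bits of an addition with a carry-free OR while adding the high-order bits exactly; the function name approx_add and the chosen bit widths are assumptions made only to show the accuracy/cost trade-off.

# Rough illustration of an imprecise adder (not the paper's BIC circuits): the
# low-order bits are combined with a carry-free OR while the high-order bits are
# added exactly, trading accuracy for a cheaper, faster circuit.
def approx_add(a, b, width=8, approx_bits=3):
    lo_mask = (1 << approx_bits) - 1
    lo = (a | b) & lo_mask                                          # cheap, carry-free low part
    hi = ((a >> approx_bits) + (b >> approx_bits)) << approx_bits   # exact high part
    return (hi | lo) & ((1 << (width + 1)) - 1)

print(approx_add(101, 59), "vs exact", 101 + 59)                    # prints 159 vs exact 160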

458 citations

Proceedings ArticleDOI
22 Jan 2009
TL;DR: A new method for fault-tolerant implementation of neural networks that detects and corrects any single fault in the network and achieves complete fault tolerance for single faults with at most 40% area overhead.
Abstract: Artificial Neural Networks (ANNs) are widely used in computational and industrial applications. As technology scales, hardware feature sizes become progressively smaller and the number of faults increases. Therefore, fault-tolerant methods are necessary, especially for ANNs used in critical applications. In this work, we propose a new method for fault-tolerant implementation of neural networks. We add a spare neuron to each of the hidden and output layers, and with each input pattern one of the hidden or output neurons is tested. Our technique detects and corrects any single fault in the network. We achieve complete fault tolerance for single faults with at most 40% area overhead.
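
The spare-neuron scheme can be sketched in software as follows. This is a hedged Python model, not the authors' hardware implementation: the class name FaultTolerantLayer, the round-robin check index, and the tanh activation are illustrative assumptions.

# Hedged software model of the spare-neuron scheme (not the authors' hardware).
# On each input pattern one neuron of the layer is re-computed and compared
# against the hardware output; on a mismatch the spare takes over that neuron.
import numpy as np

def neuron(w, x, b):
    # simple neuron model; the activation function is an assumption
    return np.tanh(np.dot(w, x) + b)

class FaultTolerantLayer:
    def __init__(self, weights, biases):
        self.W, self.b = weights, biases       # one weight row / bias per neuron
        self.spare_for = None                  # index of the neuron the spare replaces
        self.check_idx = 0                     # neuron tested on the current pattern

    def forward(self, x, hw_outputs):
        """hw_outputs: outputs produced by the (possibly faulty) hardware neurons."""
        i = self.check_idx
        golden = neuron(self.W[i], x, self.b[i])             # reference re-computation
        if not np.isclose(golden, hw_outputs[i]):
            self.spare_for = i                               # single fault detected
        out = hw_outputs.copy()
        if self.spare_for is not None:
            j = self.spare_for
            out[j] = neuron(self.W[j], x, self.b[j])         # spare corrects neuron j
        self.check_idx = (self.check_idx + 1) % len(self.b)  # round-robin testing
        return out

W, b = np.array([[0.5, -0.2], [0.1, 0.4], [0.3, 0.3]]), np.zeros(3)
layer = FaultTolerantLayer(W, b)
x = np.array([1.0, 2.0])
hw = np.array([neuron(W[i], x, b[i]) for i in range(3)])
hw[0] = 0.0                                 # inject a fault into neuron 0
print(layer.forward(x, hw))                 # neuron 0 is detected and corrected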

18 citations

Proceedings ArticleDOI
09 May 2012
TL;DR: A computational profiling on signal processing tasks for a typical BCI system is performed and adaptive algorithms that will adjust the computational complexity of the signal processing based on the amount of energy available are investigated, while guaranteeing that the accuracy is minimally compromised.
Abstract: Brain Computer Interface (BCI) is gaining popularity due to recent advances in developing small and compact electronic technology and electrodes. Miniaturization and form factor reduction in particular are the key objectives for Body Sensor Networks (BSNs) and wearable systems that implement BCIs. More complex signal processing techniques have been developed in the past few years for BCI, which creates further challenges for form factor reduction. In this paper, we perform a computational profiling of the signal processing tasks for a typical BCI system. We employ several common feature extraction techniques. We define a cost function based on the computational complexity of each feature dimension and present a sequential feature selection to explore the complexity versus the accuracy. We discuss the trade-offs between the computational cost and the accuracy of the system. This will be useful for emerging mobile, wearable and power-aware BCI systems where the computational complexity, the form factor, the size of the battery and the power consumption are of significant importance. We investigate adaptive algorithms that adjust the computational complexity of the signal processing based on the amount of energy available, while guaranteeing that the accuracy is minimally compromised. We perform an analysis on a standard inhibition (Go/NoGo) task. We demonstrate that, while classification accuracy is reduced by 2% compared to the best classification accuracy obtained, the computational complexity of the system can be reduced by more than 60%. Furthermore, we investigate the performance of our technique on real-time EEG signals provided by an eMotiv® device for a Push/No Push task.
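
One possible reading of the cost-aware selection step is sketched below in Python. It is an illustration under stated assumptions, not the paper's exact procedure: the LDA classifier, the 5-fold cross-validation, and the accuracy-per-cost score are placeholders for whatever cost model and classifier the authors actually used.

# Sketch of cost-aware sequential forward feature selection (not the paper's
# exact procedure): greedily add the feature with the best cross-validated
# accuracy per unit of computational cost until the cost budget is exhausted.
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import cross_val_score

def select_features(X, y, feature_cost, budget):
    selected, total_cost = [], 0.0
    remaining = list(range(X.shape[1]))
    while remaining:
        best, best_score = None, -np.inf
        for f in remaining:
            if total_cost + feature_cost[f] > budget:
                continue                                  # feature no longer affordable
            acc = cross_val_score(LinearDiscriminantAnalysis(),
                                  X[:, selected + [f]], y, cv=5).mean()
            score = acc / feature_cost[f]                 # accuracy per unit of cost
            if score > best_score:
                best, best_score = f, score
        if best is None:                                  # nothing fits in the budget
            break
        selected.append(best)
        total_cost += feature_cost[best]
        remaining.remove(best)
    return selected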

18 citations

Proceedings ArticleDOI
25 Apr 2016
TL;DR: This work introduces a methodology for dynamically selecting whether to subject a wafer to a complete or a reduced probe-test flow, while ensuring that the concomitant test cost savings do not compromise test quality.
Abstract: We introduce a methodology for dynamically selecting whether to subject a wafer to a complete or a reduced probe-test flow, while ensuring that the concomitant test cost savings do not compromise test quality. The granularity of this decision is at the wafer-level and is made before the wafer reaches the probe station, based on an e-test signature which reflects how process variations have affected this particular wafer. While the proposed method may offer less flexibility than approaches that dynamically adapt the test flow on a per-die basis, its implementation is simpler and more compatible with most commonly used Automatic Test Equipment. Furthermore, unlike static test elimination approaches, whose agility is limited by the relative importance of the dropped tests, the proposed method is capable of exploring test cost reduction solutions which maintain very low test escape rates. Decisions are made by an intelligent system which maps every point in the e-test signature space to either the complete or the reduced test flow. Training of the system seeks to maximize the number of wafers subjected to the reduced flow for a given target of test escapes, thereby enabling exploration of the trade-off between test cost reduction and test quality. The proposed method is demonstrated on an industrial dataset of a few million devices from a Texas Instruments RF transceiver.
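
One way the wafer-level decision could be realized is sketched below, assuming a generic probabilistic classifier over e-test signatures. The RandomForestClassifier, the threshold sweep, and the variable names (risky_val, escape_target) are illustrative assumptions, not the paper's trained system.

# Sketch of wafer-level adaptive test-flow selection (illustrative only). A
# classifier scores each wafer's e-test signature; the decision threshold is
# chosen to maximize the fraction of wafers on the reduced flow while the
# estimated test-escape rate among them stays below a target.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def pick_threshold(clf, etest_val, risky_val, escape_target):
    """risky_val[i] = 1 if wafer i would produce escapes under the reduced flow."""
    scores = clf.predict_proba(etest_val)[:, 1]       # estimated risk if reduced
    best_t, best_frac = 0.0, 0.0
    for t in np.linspace(0, 1, 101):
        reduced = scores < t                          # wafers routed to reduced flow
        if not reduced.any():
            continue
        if risky_val[reduced].mean() <= escape_target and reduced.mean() > best_frac:
            best_t, best_frac = t, reduced.mean()
    return best_t                                     # map: score < best_t -> reduced flow

# Training sketch on historical complete-flow data (labels come from hindsight):
# clf = RandomForestClassifier(n_estimators=200).fit(etest_train, risky_train)
# t = pick_threshold(clf, etest_val, risky_val, escape_target=0.001)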

14 citations

Proceedings ArticleDOI
06 Jul 2015
TL;DR: This work discusses the state-of-the-art methods for predicting workload dynamics and compares their performance, and introduces a prediction method based on Support Vector Regression (SVR), which accurately predicts the workload behavior several steps ahead.
Abstract: As a result of technology scaling, power density of multi-core chips increases and leads to temperature hot-spots which accelerate device aging and chip failure. Moreover, intense efforts to reduce power consumption by employing low-power techniques decrease the reliability of new design generations. Traditionally, reactive thermal/power management techniques have been used to take appropriate action when the temperature reaches a threshold. However, these approaches do not always balance temperature and, as a result, may degrade system reliability. Therefore, to distribute temperature evenly across all cores, a proactive mechanism is needed to forecast future workload characteristics and the corresponding temperature, in order to make decisions before hot spots occur. Such proactive methods rely on an engine to precisely predict future workload characteristics. In this work, we first discuss the state-of-the-art methods for predicting workload dynamics and we compare their performance. We, then, introduce a prediction method based on Support Vector Regression (SVR), which accurately predicts the workload behavior several steps ahead. To evaluate the effectiveness of our approach, we use several programs from the PARSEC benchmark suite on an UltraSPARC T1 processor running the Sun Solaris operating system and we extract architectural traces. Then, the extracted traces are used to generate power and thermal profiles for each core using the McPAT and Hot-Spot simulators. Our results show that the proposed method forecasts workload dynamics and power very accurately and outperforms previous prediction techniques.
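
A minimal Python sketch of SVR-based multi-step-ahead prediction is shown below, assuming lagged utilization samples as features; the synthetic trace, lag count, horizon, and SVR hyperparameters are placeholders rather than the paper's configuration (in the paper, traces come from PARSEC runs on an UltraSPARC T1).

# Sketch of SVR-based workload forecasting: past samples are used as lag
# features to predict the value several steps ahead.
import numpy as np
from sklearn.svm import SVR

def make_dataset(trace, lags=8, horizon=3):
    X, y = [], []
    for t in range(lags, len(trace) - horizon):
        X.append(trace[t - lags:t])                  # the last `lags` samples
        y.append(trace[t + horizon])                 # value `horizon` steps ahead
    return np.array(X), np.array(y)

trace = np.sin(np.linspace(0, 20, 500)) + 0.1 * np.random.randn(500)   # toy workload trace
X, y = make_dataset(trace)
model = SVR(kernel="rbf", C=10.0, epsilon=0.01).fit(X[:400], y[:400])
pred = model.predict(X[400:])
print("mean abs error:", np.abs(pred - y[400:]).mean())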

12 citations


Cited by
Journal ArticleDOI
01 Apr 1988-Nature
TL;DR: In this paper, a sedimentological core and petrographic characterisation of samples from eleven boreholes from the Lower Carboniferous of Bowland Basin (Northwest England) is presented.
Abstract: Deposits of clastic carbonate-dominated (calciclastic) sedimentary slope systems in the rock record have been identified mostly as linearly-consistent carbonate apron deposits, even though most ancient clastic carbonate slope deposits fit submarine fan systems better. Calciclastic submarine fans are consequently rarely described and are poorly understood, and very little is known about mud-dominated calciclastic submarine fan systems in particular. Presented in this study are a sedimentological core and petrographic characterisation of samples from eleven boreholes from the Lower Carboniferous of the Bowland Basin (Northwest England) that reveals a >250 m thick calciturbidite complex deposited in a calciclastic submarine fan setting. Seven facies are recognised from core and thin-section characterisation and are grouped into three carbonate turbidite sequences. They include: 1) calciturbidites, comprising mostly high- to low-density, wavy-laminated bioclast-rich facies; 2) low-density densite mudstones, which are characterised by planar-laminated and unlaminated mud-dominated facies; and 3) calcidebrites, which are muddy or hyper-concentrated debris-flow deposits occurring as poorly-sorted, chaotic, mud-supported floatstones.

9,929 citations

Proceedings ArticleDOI
27 May 2013
TL;DR: This paper reviews recent progress in the area, including design of approximate arithmetic blocks, pertinent error and quality measures, and algorithm-level techniques for approximate computing.
Abstract: Approximate computing has recently emerged as a promising approach to energy-efficient design of digital systems. Approximate computing relies on the ability of many systems and applications to tolerate some loss of quality or optimality in the computed result. By relaxing the need for fully precise or completely deterministic operations, approximate computing techniques allow substantially improved energy efficiency. This paper reviews recent progress in the area, including design of approximate arithmetic blocks, pertinent error and quality measures, and algorithm-level techniques for approximate computing.

921 citations

Journal ArticleDOI
TL;DR: This paper proposes logic complexity reduction at the transistor level as an alternative approach to take advantage of the relaxation of numerical accuracy, and demonstrates the utility of these approximate adders in two digital signal processing architectures with specific quality constraints.
Abstract: Low power is an imperative requirement for portable multimedia devices employing various signal processing algorithms and architectures. In most multimedia applications, human beings can gather useful information from slightly erroneous outputs. Therefore, we do not need to produce exactly correct numerical outputs. Previous research in this context exploits error resiliency primarily through voltage overscaling, utilizing algorithmic and architectural techniques to mitigate the resulting errors. In this paper, we propose logic complexity reduction at the transistor level as an alternative approach to take advantage of the relaxation of numerical accuracy. We demonstrate this concept by proposing various imprecise or approximate full adder cells with reduced complexity at the transistor level, and utilize them to design approximate multi-bit adders. In addition to the inherent reduction in switched capacitance, our techniques result in significantly shorter critical paths, enabling voltage scaling. We design architectures for video and image compression algorithms using the proposed approximate arithmetic units and evaluate them to demonstrate the efficacy of our approach. We also derive simple mathematical models for error and power consumption of these approximate adders. Furthermore, we demonstrate the utility of these approximate adders in two digital signal processing architectures (discrete cosine transform and finite impulse response filter) with specific quality constraints. Simulation results indicate up to 69% power savings using the proposed approximate adders, when compared to existing implementations using accurate adders.
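
To make the idea concrete, here is a hedged Python model (not the paper's transistor-level cells) of a ripple-carry adder in which the least-significant positions use a simplified full-adder truth table that approximates the sum as the complement of the carry-out; the number of approximate positions and the 8-bit width are illustrative choices.

# Illustrative model of approximate full-adder cells composed into a multi-bit
# adder: the k least-significant positions use a simplified truth table while
# the remaining positions stay exact.
EXACT_FA = {(a, b, c): (a ^ b ^ c, (a & b) | (c & (a ^ b)))
            for a in (0, 1) for b in (0, 1) for c in (0, 1)}
# Hypothetical simplified cell: the sum is approximated as NOT(carry-out), which
# is correct for 6 of the 8 input combinations (wrong for 000 and 111).
APPROX_FA = {k: (1 - v[1], v[1]) for k, v in EXACT_FA.items()}

def ripple_add(a, b, width=8, approx_lsbs=3):
    carry, result = 0, 0
    for i in range(width):
        ai, bi = (a >> i) & 1, (b >> i) & 1
        cell = APPROX_FA if i < approx_lsbs else EXACT_FA
        s, carry = cell[(ai, bi, carry)]
        result |= s << i
    return result | (carry << width)

print(ripple_add(100, 57), "vs exact", 100 + 57)    # prints 159 vs exact 157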

637 citations

Journal ArticleDOI
TL;DR: New metrics are proposed for evaluating the reliability as well as the power efficiency of approximate and probabilistic adders and it is shown that the MED is an effective metric for measuring the implementation accuracy of a multiple-bit adder and that the NED is a nearly invariant metric independent of the size of an adder.
Abstract: Addition is a fundamental function in arithmetic operation; several adder designs have been proposed for implementations in inexact computing. These adders show different operational profiles; some of them are approximate in nature while others rely on probabilistic features of nanoscale circuits. However, there has been a lack of appropriate metrics to evaluate the efficacy of various inexact designs. In this paper, new metrics are proposed for evaluating the reliability as well as the power efficiency of approximate and probabilistic adders. Reliability is analyzed using the so-called sequential probability transition matrices (SPTMs). Error distance (ED) is initially defined as the arithmetic distance between an erroneous output and the correct output for a given input. The mean error distance (MED) and normalized error distance (NED) are then proposed as unified figures that consider the averaging effect of multiple inputs and the normalization of multiple-bit adders. It is shown that the MED is an effective metric for measuring the implementation accuracy of a multiple-bit adder and that the NED is a nearly invariant metric independent of the size of an adder. The MED is, therefore, useful in assessing the effectiveness of an approximate or probabilistic adder implementation, while the NED is useful in characterizing the reliability of a specific design. Since inexact adders are often used for saving power, the product of power and NED is further utilized for evaluating the tradeoffs between power consumption and precision. Although illustrated using adders, the proposed metrics are potentially useful in assessing other arithmetic circuit designs for applications of inexact computing.
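
The ED/MED/NED definitions can be made concrete with a short Python sketch that evaluates a small approximate adder exhaustively. The normalization of NED by the largest exact sum is one common convention and may differ from the paper's exact definition; approx_add stands for any adder under test.

# Error-distance metrics evaluated exhaustively for a small approximate adder.
def error_metrics(approx_add, n_bits=8):
    max_sum = (1 << n_bits) * 2 - 2                       # largest exact result
    eds = []
    for a in range(1 << n_bits):
        for b in range(1 << n_bits):
            eds.append(abs(approx_add(a, b) - (a + b)))   # error distance (ED)
    med = sum(eds) / len(eds)                             # mean error distance (MED)
    ned = med / max_sum                                   # normalized error distance (NED)
    return med, ned

print(error_metrics(lambda a, b: a + b, n_bits=6))        # exact adder: (0.0, 0.0)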

453 citations

Journal ArticleDOI
TL;DR: The results show that the proposed designs accomplish significant reductions in power dissipation, delay and transistor count compared to an exact design; moreover, two of the proposed multiplier designs provide excellent capabilities for image multiplication with respect to average normalized error distance and peak signal-to-noise ratio.
Abstract: Inexact (or approximate) computing is an attractive paradigm for digital processing at nanometric scales. Inexact computing is particularly interesting for computer arithmetic designs. This paper deals with the analysis and design of two new approximate 4-2 compressors for utilization in a multiplier. These designs rely on different features of compression, such that imprecision in computation (as measured by the error rate and the so-called normalized error distance) can be traded off against circuit-based figures of merit of a design (number of transistors, delay and power consumption). Four different schemes for utilizing the proposed approximate compressors are proposed and analyzed for a Dadda multiplier. Extensive simulation results are provided and an application of the approximate multipliers to image processing is presented. The results show that the proposed designs accomplish significant reductions in power dissipation, delay and transistor count compared to an exact design; moreover, two of the proposed multiplier designs provide excellent capabilities for image multiplication with respect to average normalized error distance and peak signal-to-noise ratio (more than 50 dB for the considered image examples).
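
As an illustration only (not the paper's compressor designs), the following Python sketch models an exact 4-2 compressor and a hypothetical simplified variant that never generates cout, then measures its error rate exhaustively over all 32 input combinations.

# Model of a 4-2 compressor. An exact compressor satisfies
# x1 + x2 + x3 + x4 + cin = sum + 2*carry + 2*cout.
from itertools import product

def exact_compressor(x1, x2, x3, x4, cin):
    total = x1 + x2 + x3 + x4 + cin          # 0..5
    cout = 1 if total >= 4 else 0            # weight-2 carry to the next column
    rem = total - 2 * cout                   # 0..3 encoded in (carry, sum)
    return rem & 1, rem >> 1, cout           # (sum, carry, cout)

def approx_compressor(x1, x2, x3, x4, cin):
    # Hypothetical simplification: never generate cout and clamp the total at 3,
    # which is wrong only when the five inputs sum to 4 or 5.
    total = min(x1 + x2 + x3 + x4 + cin, 3)
    return total & 1, total >> 1, 0

errors = 0
for bits in product((0, 1), repeat=5):
    s, c, co = exact_compressor(*bits)
    sa, ca, coa = approx_compressor(*bits)
    if sa + 2 * (ca + coa) != s + 2 * (c + co):
        errors += 1
print("error rate:", errors / 32)            # 6 of 32 input patterns are wrong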

447 citations