Author

Hussam Amrouch

Bio: Hussam Amrouch is an academic researcher from the University of Stuttgart. The author has contributed to research in topics: Computer science & Transistor. The author has an h-index of 18, co-authored 133 publications receiving 1113 citations. Previous affiliations of Hussam Amrouch include Karlsruhe Institute of Technology & Southern Illinois University Carbondale.


Papers
Proceedings ArticleDOI
05 Jun 2016
TL;DR: It is demonstrated that degradation-aware libraries and tool flows are indispensable not only for accurately estimating guardbands but also for efficiently containing them, and that aging can be effectively suppressed.
Abstract: Due to aging, circuit reliability has become extraordinarily challenging. Reliability-aware circuit design flows virtually do not exist and even research is in its infancy. In this paper, we propose to bring aging awareness to EDA tool flows based on so-called degradation-aware cell libraries. These libraries include detailed delay information of gates/cells under the impact that aging has on both the threshold voltage (Vth) and carrier mobility (μ) of transistors. This is unlike the state of the art, which considers Vth only. We show how ignoring μ degradation leads to underestimating guardbands by 19% on average. Our investigation revealed that the impact of aging is strongly dependent on the operating conditions of gates (i.e., input signal slew and output load capacitance), and not solely on the duty cycle of transistors. Neglecting this fact results in employing insufficient guardbands and thus not sustaining reliability over the lifetime. We demonstrate that degradation-aware libraries and tool flows are indispensable not only for accurately estimating guardbands but also for efficiently containing them. By considering aging degradations during logic synthesis, significantly more resilient circuits can be obtained. We further quantify the impact of aging on the degradation of image processing circuits. This goes far beyond investigating aging with respect to path delays solely. We show that in a standard design without any guardbanding, aging leads to unacceptable image quality after just one year. By contrast, if the synthesis tool is provided with the degradation-aware cell library, high image quality is sustained for 10 years (even under worst-case aging and without a guardband). Hence, using our approach, aging can be effectively suppressed.
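The core point, that aged gate delay depends on both the Vth shift and the mobility loss, can be illustrated with a simple alpha-power-law delay model. The Python sketch below is not the paper's characterization flow; the model form, coefficients, and degradation values are illustrative assumptions.

    # Minimal sketch (not the paper's degradation-aware library flow): an
    # alpha-power-law delay model showing that modeling the Vth shift alone
    # underestimates the delay guardband compared to also modeling mobility loss.
    # All numbers below are illustrative assumptions.

    def gate_delay(c_load, vdd, vth, mu, alpha=1.3, k=1.0):
        """Relative gate delay ~ k * C_L * Vdd / (mu * (Vdd - Vth)^alpha)."""
        return k * c_load * vdd / (mu * (vdd - vth) ** alpha)

    fresh      = gate_delay(c_load=1.0, vdd=0.8, vth=0.30, mu=1.00)
    vth_only   = gate_delay(c_load=1.0, vdd=0.8, vth=0.35, mu=1.00)  # aging modeled via Vth only
    vth_and_mu = gate_delay(c_load=1.0, vdd=0.8, vth=0.35, mu=0.92)  # aging in Vth and mobility

    print("guardband if only Vth degradation is modeled  :", vth_only / fresh - 1)
    print("guardband if Vth and mu degradation are modeled:", vth_and_mu / fresh - 1)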

92 citations

Journal ArticleDOI
TL;DR: This paper investigates how the NCFET technology can open the doors not only for the continuation of Moore's law, which is approaching its end, but also for reviving Dennard’s scaling, which stopped more than a decade ago.
Abstract: Negative capacitance field-effect transistor (NCFET) technology addresses one of the key fundamental limits in technology scaling, namely the non-scalable Boltzmann factor, by offering a sub-threshold swing below 60 mV/decade. In this paper, we investigate how the NCFET technology can open the doors not only for the continuation of Moore's law, which is approaching its end, but also for reviving Dennard's scaling, which stopped more than a decade ago. We study NCFET for the 7-nm FinFET technology node, from physics to processors, and demonstrate that prior trends in processor design with respect to voltage and frequency can be revived with the NCFET technology. Our work focuses on answering the following three questions towards assessing the impact of NCFET technology on computing efficiency: to what extent NCFET technology will enable processors 1) to operate at higher frequencies without increasing voltage; 2) to operate at higher frequencies without increasing power density, which is essential because maintaining on-chip power densities under tight constraints due to limited cooling capabilities is inevitable; and 3) to operate at lower voltages while still fulfilling performance requirements, which is essential for the emerging Internet of Things, in which the available power budgets for such devices are typically very restricted.
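The quantity behind these claims is the sub-threshold swing. A minimal sketch, assuming the textbook relation SS = n·(kT/q)·ln 10 rather than anything from the paper, shows why a conventional FET is pinned near 60 mV/decade at room temperature while a negative-capacitance gate stack, which can push the effective body factor n below 1, is not.

    # Minimal sketch (illustrative, not from the paper): sub-threshold swing
    # SS = n * (kT/q) * ln(10). For a conventional FET the body factor
    # n = 1 + C_dep/C_ox >= 1, so SS cannot fall below ~60 mV/dec at 300 K.
    # A negative capacitance in series can yield an effective n below 1.

    import math

    THERMAL_VOLTAGE = 0.02585  # kT/q at 300 K, in volts

    def subthreshold_swing_mv_per_dec(n):
        """Sub-threshold swing SS = n * (kT/q) * ln(10), returned in mV/decade."""
        return n * THERMAL_VOLTAGE * math.log(10) * 1000.0

    print("conventional FET, n = 1.1 :", round(subthreshold_swing_mv_per_dec(1.1), 1), "mV/dec")
    print("ideal MOSFET limit, n = 1 :", round(subthreshold_swing_mv_per_dec(1.0), 1), "mV/dec")
    print("NCFET example, n = 0.8    :", round(subthreshold_swing_mv_per_dec(0.8), 1), "mV/dec")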

72 citations

Proceedings ArticleDOI
03 Nov 2014
TL;DR: It is shown that the overall aging can be modeled as a superposition of the interdependent aging effects, and it is demonstrated that estimating reliability from an individual dominant aging mechanism while considering only a single kind of failure, as is currently the main focus of the state of the art, can result in a 75% underestimation on average.
Abstract: With technology in the deep nanoscale regime, the susceptibility of transistors to various aging mechanisms such as Negative/Positive Bias Temperature Instability (NBTI/PBTI) and Hot Carrier Induced Degradation (HCID) is increasing. As a matter of fact, different aging mechanisms occur simultaneously in the gate dielectric of a transistor. In addition, scaling in conjunction with high-k materials has made aging mechanisms that have often been assumed to be negligible (e.g., PBTI in NMOS and HCID in PMOS) become noticeable. Therefore, in this paper we investigate the key challenge of providing designers with an abstracted, yet accurate, reliability estimation that combines, from the physical to the system level, the effects of multiple simultaneous aging mechanisms and their interdependencies. We show that the overall aging can be modeled as a superposition of the interdependent aging effects. Our presented model deviates by around 6% from recent industrial physical measurements. We conclude from our experiments that an isolated treatment of individual aging mechanisms is insufficient to devise effective mitigation strategies in current and upcoming technology nodes. We also demonstrate that estimating reliability from an individual dominant aging mechanism while considering only a single kind of failure, as is currently the main focus of the state of the art (e.g., [28], [22]), can result in a 75% underestimation on average.
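A minimal sketch of the superposition idea, with power-law coefficients that are purely illustrative assumptions rather than the paper's calibrated model: each mechanism contributes its own Vth shift, and accounting only for the dominant mechanism visibly underestimates the total degradation.

    # Minimal sketch (assumed coefficients, not the paper's model): overall aging
    # taken as the superposition of per-mechanism power-law Vth shifts.

    def delta_vth(a, t_years, n):
        """Power-law Vth shift of one aging mechanism: dVth = a * t^n (volts)."""
        return a * t_years ** n

    t = 10.0                                       # assumed years of operation
    nbti = delta_vth(a=0.030, t_years=t, n=0.20)   # assumed NBTI coefficients
    pbti = delta_vth(a=0.012, t_years=t, n=0.20)   # assumed PBTI coefficients
    hcid = delta_vth(a=0.008, t_years=t, n=0.45)   # assumed HCID coefficients

    total = nbti + pbti + hcid                     # superposition of all mechanisms
    print("dominant mechanism only (NBTI) :", round(nbti, 4), "V")
    print("superposition of all mechanisms:", round(total, 4), "V")
    print("relative underestimation       :", round(1 - nbti / total, 2))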

72 citations

Proceedings ArticleDOI
05 Jun 2016
TL;DR: This work proposes a control-theory based dynamic thermal management technique that cooperatively scales CPU and GPU frequencies to meet the thermal constraint while achieving high performance for mobile gaming.
Abstract: State-of-the-art thermal management techniques independently throttle the frequencies of the high-performance multi-core CPU and the powerful graphics processing unit (GPU) on heterogeneous multiprocessor systems-on-chip deployed in the latest mobile devices. For graphics-intensive gaming applications, this approach is inadequate because both the CPU and the GPU contribute to the overall application performance (frames per second, or FPS) as well as to the on-chip temperature. The lack of coordination between CPU and GPU induces recurrent frequency throttling to maintain the on-chip temperature below the permissible limit. This leads to significantly degraded application performance and large variations in temperature over time. We propose a control-theory based dynamic thermal management technique that cooperatively scales the CPU and GPU frequencies to meet the thermal constraint while achieving high performance for mobile gaming. Experimental results with six popular Android games on a commercial mobile platform show an average 19% performance improvement and over 90% reduction in temperature variance compared to the original Linux approach.
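One way to picture a cooperative controller of this kind is a single thermal feedback loop that sets a shared frequency budget and splits it between CPU and GPU according to which side limits the frame time. The sketch below is a hypothetical PI-style illustration; the gains, limits, and split heuristic are assumptions, not the authors' controller.

    # Hypothetical PI-style sketch of cooperative CPU-GPU thermal management
    # (assumed gains and limits, not the paper's controller).

    KP, KI = 0.02, 0.005      # assumed proportional/integral gains
    T_LIMIT = 70.0            # assumed on-chip temperature limit (Celsius)
    integral = 0.0
    budget = 1.0              # normalized joint frequency budget, kept in [0.2, 1.0]

    def control_step(temp_c, cpu_frame_ms, gpu_frame_ms):
        """One control period: update the budget from the thermal error, then
        split it so the current frame-time bottleneck keeps the larger share."""
        global integral, budget
        error = T_LIMIT - temp_c
        integral += error
        budget = min(1.0, max(0.2, budget + KP * error + KI * integral))
        total_ms = cpu_frame_ms + gpu_frame_ms
        cpu_scale = min(1.0, budget * (0.5 + 0.5 * cpu_frame_ms / total_ms))
        gpu_scale = min(1.0, budget * (0.5 + 0.5 * gpu_frame_ms / total_ms))
        return cpu_scale, gpu_scale

    print(control_step(temp_c=74.0, cpu_frame_ms=6.0, gpu_frame_ms=10.0))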

66 citations

Journal ArticleDOI
TL;DR: A time-efficient automated framework is proposed for mapping the NN weights to the accuracy levels of an approximate reconfigurable accelerator; it satisfies tight accuracy-loss thresholds while significantly reducing energy consumption without any need for intensive NN retraining.
Abstract: Current research in the area of Neural Networks (NNs) has resulted in performance advancements for a variety of complex problems. In particular, embedded system applications increasingly rely on the utilization of convolutional NNs to provide services such as image/audio classification and object detection. The core arithmetic computation performed during NN inference is the multiply-accumulate (MAC) operation. In order to meet ever tighter throughput constraints, NN accelerators integrate thousands of MAC units, resulting in a significant increase in power consumption. Approximate computing has been established as a design alternative for improving the efficiency of computing systems by trading computational accuracy for high energy savings. In this work, we bring approximate computing principles and NN inference together by designing NN-specific approximate multipliers that feature multiple accuracy levels at run time. We propose a time-efficient automated framework for mapping the NN weights to the accuracy levels of the approximate reconfigurable accelerator. The proposed weight-oriented approximation mapping is able to satisfy tight accuracy-loss thresholds while significantly reducing energy consumption, without any need for intensive NN retraining. Our approach is evaluated against several NNs, demonstrating that it delivers high energy savings (17.8% on average) with a minimal loss in inference accuracy (0.5%).
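The mapping problem itself can be illustrated with a small greedy heuristic: assign each weight the most aggressive (lowest-energy) multiplier mode that still keeps an overall error budget. This is a hedged illustration of the general idea only; the modes, error figures, and sensitivity proxy (weight magnitude) are assumptions, not the paper's framework.

    # Illustrative greedy weight-to-accuracy-level mapping (assumed modes and
    # error figures, not the paper's framework). More approximate modes are
    # assumed to consume less energy per MAC operation.

    LEVELS = [("exact", 0.000), ("approx_low", 0.005), ("approx_high", 0.020)]

    def map_weights(weights, error_budget):
        """Greedily map each weight to the most approximate mode that keeps the
        summed magnitude-weighted error under the budget."""
        mapping = {i: "exact" for i in range(len(weights))}
        spent = 0.0
        for i in sorted(range(len(weights)), key=lambda i: abs(weights[i])):
            for mode, err in reversed(LEVELS):   # try the most approximate mode first
                cost = abs(weights[i]) * err
                if spent + cost <= error_budget:
                    mapping[i], spent = mode, spent + cost
                    break
        return mapping

    print(map_weights([0.01, -0.30, 0.80, 0.05], error_budget=0.01))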

56 citations


Cited by
01 Jan 2010
TL;DR: This paper proposes BlueChip, a defensive strategy with both a design-time component and a runtime component, which is able to prevent all hardware attacks the authors evaluate while incurring a small runtime overhead.
Abstract: The computer systems security arms race between attackers and defenders has largely taken place in the domain of software systems, but as hardware complexity and design processes have evolved, novel and potent hardware-based security threats are now possible. This article presents Unused Circuit Identification (UCI), an approach for detecting suspicious circuits at design time, and BlueChip, a hybrid hardware/software approach to deactivating suspicious circuits and making up for UCI classifier errors at runtime.
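The UCI intuition, that logic whose output never differed from its input across all design-time tests is suspicious, can be sketched in a few lines over simulation traces. This is a simplified illustration under strong assumptions (no pipeline delays, connected signal pairs given as input), not the authors' implementation.

    # Simplified sketch of the Unused Circuit Identification (UCI) idea
    # (assumptions: no pipeline delays, netlist connectivity supplied as pairs).

    def uci_candidates(traces, connected_pairs):
        """traces: {signal: [value per cycle]}; connected_pairs: (source, sink)
        signal pairs joined by intermediate logic. A pair is flagged when the
        sink always equalled the source, i.e. the logic in between never had
        an observable effect during testing."""
        return [(src, dst) for src, dst in connected_pairs
                if all(a == b for a, b in zip(traces[src], traces[dst]))]

    traces = {"data_in": [0, 1, 1, 0], "bus_out": [0, 1, 1, 0], "ctrl": [1, 0, 1, 1]}
    print(uci_candidates(traces, [("data_in", "bus_out"), ("data_in", "ctrl")]))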

220 citations