Author

Saman Kiamehr

Bio: Saman Kiamehr is an academic researcher from Bosch. The author has contributed to research in topics: Negative-bias temperature instability & Very-large-scale integration. The author has an h-index of 18 and has co-authored 71 publications receiving 793 citations. Previous affiliations of Saman Kiamehr include Karlsruhe Institute of Technology.

Papers published on a yearly basis

Papers
Proceedings ArticleDOI
18 Nov 2013
TL;DR: An aging-aware logic synthesis approach is proposed to increase circuit lifetime with respect to a specific guardband; experiments show that the approach improves circuit lifetime on average by more than 3X with negligible impact on area.
Abstract: As CMOS technology scales down into the nanometer regime, designers have to add pessimistic timing margins to the circuit as guardbands to avoid timing violations due to various reliability effects, in particular accelerated transistor aging. Since aging is workload-dependent, the aging rates of different paths are non-uniform, and hence circuits that are delay-balanced at design time become significantly unbalanced after some operational time. In this paper, an aging-aware logic synthesis approach is proposed to increase circuit lifetime with respect to a specific guardband. Our main objective is to optimize the design timing with respect to post-aging delay in such a way that all paths reach the assigned guardband at the same time. To this end, in an iterative process, after computing the post-aging delays, the lifetime is improved by putting tighter timing constraints on paths with higher aging rates and looser constraints on paths whose post-aging delay is below the desired guardband. The experimental results show that the proposed approach improves circuit lifetime on average by more than 3X with negligible impact on area. Our approach is implemented on top of a commercial synthesis toolchain, and hence scales very well.
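The iterative loop described above is easy to illustrate. The following minimal, self-contained Python sketch balances three toy paths against a 10% guardband; the aging model, the 0.98 safety factor, the 0.5 mixing factor, and the stand-in "resynthesis" step (nudging a path's design-time delay toward its constraint) are all assumptions for illustration, not the paper's commercial-toolchain flow.

```python
# Toy guardband-balancing loop: tighten constraints on fast-aging paths,
# relax them on slow-aging paths, until all aged delays fit the guardband.
GUARDBAND = 1.10   # each path may degrade to at most 110% of the clock period
CLOCK = 1.0        # normalized clock period

# Each path: design-time delay and a workload-dependent per-path aging factor.
paths = [
    {"delay": 0.95, "aging": 0.25},   # fast-aging near-critical path
    {"delay": 0.90, "aging": 0.03},   # slow-aging path with post-aging slack
    {"delay": 0.80, "aging": 0.08},
]

def aged_delay(p):
    return p["delay"] * (1.0 + p["aging"])

for step in range(10):
    if max(aged_delay(p) for p in paths) <= GUARDBAND * CLOCK:
        break                          # every path meets the guardband when aged
    for p in paths:
        # Constraint so the aged path just fits (0.98 safety factor), but
        # never looser than the clock period itself.
        target = min(CLOCK, 0.98 * GUARDBAND * CLOCK / (1.0 + p["aging"]))
        # Stand-in "resynthesis": move the design-time delay toward its target.
        p["delay"] += 0.5 * (target - p["delay"])

for p in paths:
    print(f"fresh={p['delay']:.3f}  aged={aged_delay(p):.3f}")
```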

57 citations

Proceedings ArticleDOI
01 Dec 2016
TL;DR: Sensors appropriate for observing delay changes caused by transient voltage fluctuations are implemented and calibrated in configurable logic and placed at multiple locations on the chip to evaluate temporal and spatial changes in timing margin under different workload characteristics.
Abstract: Due to recent technology scaling trends and increased circuit complexity, process and runtime variabilities are becoming major threats to correct circuit operation. Among these, transient voltage fluctuations appear to be the most critical issue, accounting for the biggest component of the timing margin at increased cost. As various design and workload parameters have an impact on voltage fluctuations, they need to be fully understood in order to design efficient countermeasures and margining. FPGAs are well suited for this analysis, allowing more control over such experiments at lower cost than ASICs. Moreover, they suffer heavily from the same issues, which are typically handled only by excessive and over-pessimistic timing margins built into the mapping tools. In this work, we implemented and calibrated sensors in configurable logic appropriate for observing delay changes caused by transient voltage fluctuations. We place them at multiple locations on the chip to evaluate temporal and spatial changes in timing margin due to different workload characteristics. This analysis provides useful insights to designers for application mapping and workload scheduling.
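As a rough illustration of the temporal/spatial margin analysis the paper performs, the following self-contained Python sketch processes synthetic sensor traces; the noise and droop parameters are assumptions, and real data would come from the calibrated in-fabric delay sensors.

```python
# Synthetic stand-in for per-sensor delay readings at multiple chip locations;
# occasional supply droops transiently slow the sensed delay chains.
import random

SENSORS, SAMPLES = 4, 1000
random.seed(0)

# readings[s][t]: measured delay (ns) of sensor s at sample t.
readings = [[1.00 + random.gauss(0, 0.01) + (0.05 if random.random() < 0.02 else 0.0)
             for _ in range(SAMPLES)] for _ in range(SENSORS)]

# Temporal variation: per-sensor peak-to-peak delay over the whole run.
for s, trace in enumerate(readings):
    print(f"sensor {s}: temporal p2p = {max(trace) - min(trace):.3f} ns")

# Spatial variation: spread across sensor locations at each sample instant.
spatial = [max(col) - min(col) for col in zip(*readings)]
print(f"worst spatial spread = {max(spatial):.3f} ns")
```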

43 citations

Proceedings ArticleDOI
12 Mar 2012
TL;DR: This work identifies the instruction, together with its operands, that suffers minimal NBTI degradation and uses it as the NOP, and proposes two methods, software-based and hardware-based, for replacing the original NOP with this maximum-aging-reduction NOP.
Abstract: Negative Bias Temperature Instability (NBTI) is a major source of transistor aging in scaled CMOS, resulting in slower devices and a shorter lifetime. NBTI is strongly dependent on the input vector. Moreover, a considerable fraction of the execution time of an application is spent executing NOP (No Operation) instructions. Based on these observations, we present a novel NOP assignment that minimizes the NBTI effect, i.e., achieves maximum NBTI relaxation, in processors. Our analysis shows that NBTI degradation is impacted more by the source operands than by the instruction opcodes. Given this, we obtain the instruction, along with the operands, with minimal NBTI degradation, to be used as the NOP. We also propose two methods, software-based and hardware-based, to replace the original NOP with this maximum-aging-reduction NOP. Experimental results based on SPEC2000 applications running on a MIPS processor show that this method can extend the lifetime by 37% on average while the overhead is negligible.
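The selection step can be sketched as a search over NOP-equivalent instructions for the lowest modeled stress. In the Python sketch below, the stress model, its weights, and the candidate list are invented for illustration (the paper derives stress from the actual MIPS datapath); only the overall idea, that operands dominate and the minimum-stress candidate becomes the new NOP, follows the abstract.

```python
# Pick the NOP-equivalent instruction/operand combination with the lowest
# modeled NBTI stress. Model and candidates are hypothetical placeholders.

def nbti_stress(opcode, src_operands):
    # Assumption: stress grows with the fraction of '0' bits driven through
    # the datapath, and source operands dominate over the opcode.
    bits = "".join(f"{op:032b}" for op in src_operands)
    operand_stress = bits.count("0") / len(bits)
    opcode_stress = {"sll": 0.10, "or": 0.08, "addu": 0.12}[opcode]
    return 0.8 * operand_stress + 0.2 * opcode_stress

# Candidates that architecturally behave as NOPs (result written to $zero).
candidates = [
    ("sll",  [0x00000000]),                 # canonical MIPS NOP: sll $0,$0,0
    ("or",   [0xFFFFFFFF, 0xFFFFFFFF]),
    ("addu", [0xFFFFFFFF, 0x00000000]),
]

best = min(candidates, key=lambda c: nbti_stress(*c))
print("lowest-stress NOP:", best[0], [hex(x) for x in best[1]])
```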

42 citations

Proceedings ArticleDOI
12 Mar 2015
TL;DR: A low-cost self-controlled bit-flipping technique is proposed which inverts all bit positions with respect to an existing bit; applied to the register file and cache units of an embedded microprocessor, it achieves reliability similar to that of existing bit-flipping techniques while imposing 64% less area overhead.
Abstract: With CMOS technology downscaling into the nanometer regime, the reliability of SRAM memories is threatened by accelerated transistor aging mechanisms such as Bias Temperature Instability (BTI). BTI leads to a considerable degradation of the SRAM cell Static Noise Margin (SNM), which increases the memory failure rate. Since BTI is workload-dependent, the aging rates of different cells in a memory array are quite non-uniform. To address this issue, a variety of bit-flipping techniques has been proposed to decrease the SNM degradation by balancing the signal probabilities of the cells. However, existing bit-flipping techniques impose too much area and power overhead, as at least one additional column is required to store the inversion flags. In this paper, we propose a low-cost self-controlled bit-flipping technique which inverts all bit positions with respect to an existing bit. This technique is applied to the register file and cache units of an embedded microprocessor. Our simulation results show that the reliability of the proposed technique is similar to that of existing bit-flipping techniques, while imposing 64% less area overhead.
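One plausible reading of "inverts all bit positions with respect to an existing bit" is to XOR every other bit of the word with a chosen reference bit, which needs no extra flag column and is its own inverse. The Python sketch below illustrates that interpretation; the word width and the choice of reference bit are assumptions.

```python
# Self-controlled flip: all bits except the reference bit are XORed with the
# reference bit's value. The transform is an involution, so it also decodes.

WIDTH, REF = 8, 7            # 8-bit word, MSB as the reference bit (assumed)

def flip(word):
    ref = (word >> REF) & 1
    mask = ((1 << WIDTH) - 1) & ~(1 << REF)   # every bit except the reference
    return word ^ (mask if ref else 0)

for w in (0b11110000, 0b01110000):
    enc = flip(w)
    assert flip(enc) == w                     # decode == encode (involution)
    print(f"{w:08b} -> stored {enc:08b}")
```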

41 citations

Journal ArticleDOI
TL;DR: This paper presents an efficient input vector selection technique based on linear programming for co-optimizing the NBTI-induced delay degradation and leakage power consumption during standby mode, and provides a Pareto curve over both phenomena.
Abstract: Transistor aging is a major reliability concern for nanoscale CMOS technology that can significantly reduce the operational lifetime of very-large-scale integration chips. Negative bias temperature instability (NBTI) is a major contributor to transistor aging that affects pMOS transistors. On the other hand, leakage power is becoming a dominant component of the total power with successive technology scaling. Since the input combinations applied to a logic core have a significant impact on both NBTI and leakage power, input vector control can be used to optimize both phenomena during idle cycles. In this paper, we present an efficient input vector selection technique based on linear programming for co-optimizing the NBTI-induced delay degradation and leakage power consumption during standby mode. Since the NBTI-induced delay degradation and leakage power are not affected by the input vector in the same direction, we provide a Pareto curve based on both phenomena. A suitable point on this Pareto curve is chosen at runtime based on circuit conditions and requirements.
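The Pareto construction can be illustrated with a brute-force toy version: enumerate candidate standby vectors, evaluate both costs, and keep the non-dominated points. In the Python sketch below, the two cost models are invented stand-ins that deliberately conflict; the paper instead solves the selection with linear programming on the real circuit.

```python
# Enumerate standby input vectors, score each on (NBTI degradation, leakage),
# and keep the Pareto-optimal (non-dominated) points. Both models are toys.
from itertools import product

N = 6  # number of primary inputs in this toy example

def nbti_degradation(vec):
    # Assumption: more '0' inputs -> more pMOS devices under NBTI stress.
    return sum(1 - b for b in vec) / len(vec)

def leakage(vec):
    # Assumption: a different, partly conflicting dependence on the vector.
    return sum(b * (i + 1) for i, b in enumerate(vec)) / (N * (N + 1) / 2)

points = [(nbti_degradation(v), leakage(v), v) for v in product((0, 1), repeat=N)]
pareto = [p for p in points
          if not any(q[0] <= p[0] and q[1] <= p[1] and q[:2] != p[:2]
                     for q in points)]

for d, l, v in sorted(pareto):
    print(f"vector {v}: delay degradation {d:.2f}, leakage {l:.2f}")
```

A runtime controller would then pick one point from this curve, e.g. the lowest-leakage vector whose degradation still meets the remaining timing slack.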

34 citations


Cited by
Proceedings ArticleDOI
20 May 2018
TL;DR: This work introduces and demonstrates remote power side-channel attacks using an FPGA, showing that the common assumption that power side-channel attacks require specialized equipment and physical access to the victim hardware does not hold for systems with an integrated FPGA.
Abstract: The rapid adoption of heterogeneous computing has driven the integration of Field Programmable Gate Arrays (FPGAs) into cloud datacenters and flexible System-on-Chips (SoCs). This paper shows that the integrated FPGA introduces a new security vulnerability by enabling software-based power side-channel attacks without physical proximity to a target system. We first demonstrate that an on-chip power monitor can be built on a modern FPGA using ring oscillators (ROs), and characterize its ability to observe the power consumption of other modules on the FPGA or the SoC. Then, we show that the RO-based FPGA power monitor can be used for a successful power analysis attack on an RSA crypto module on the same FPGA. Additionally, we show that the FPGA-based power monitor can observe the power consumption of a CPU on the same SoC, and demonstrate that the FPGA-to-CPU power side-channel attack can break timing-channel protection for an RSA program running on the CPU. This work introduces and demonstrates remote power side-channel attacks using an FPGA, showing that the common assumption that power side-channel attacks require specialized equipment and physical access to the victim hardware does not hold for systems with an integrated FPGA.
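The sensing principle behind the RO-based monitor can be mimicked in a few lines: ring-oscillator frequency tracks supply voltage, so edge counts taken in fixed windows form a power trace. The Python simulation below is an illustration only; the nominal frequency, voltage sensitivity, and droop magnitude are assumed numbers, not measurements from the paper.

```python
# Simulated RO-based voltage sensor: count oscillations per fixed window;
# a supply droop caused by victim activity lowers the count.
import random

F_NOMINAL = 400e6      # RO frequency at nominal supply (Hz), assumed
SENSITIVITY = 1.5e9    # frequency shift per volt of supply change, assumed
WINDOW = 1e-6          # counting window (s)
random.seed(1)

trace = []
for t in range(50):
    # Victim activity modulates the supply: a droop while the crypto core runs.
    droop = -0.03 if 20 <= t < 30 else 0.0
    vdd = 1.0 + droop + random.gauss(0, 0.002)
    count = int((F_NOMINAL + SENSITIVITY * (vdd - 1.0)) * WINDOW)
    trace.append(count)

# Lower counts mark the windows where the co-located module drew more power.
print(min(trace), max(trace), "droop starts at window", trace.index(min(trace)))
```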

223 citations

Journal ArticleDOI
Florian Zaruba, Luca Benini
TL;DR: A thorough power, performance, and efficiency analysis of the RISC-V ISA targeting baseline “application class” functionality, i.e., supporting the Linux OS and its application environment based on the authors' open-source single-issue in-order implementation of the 64-bit ISA variant (RV64GC) called Ariane.
Abstract: The open-source RISC-V instruction set architecture (ISA) is gaining traction, both in industry and academia. The ISA is designed to scale from microcontrollers to server-class processors. Furthermore, openness promotes the availability of various open-source and commercial implementations. Our main contribution in this paper is a thorough power, performance, and efficiency analysis of the RISC-V ISA targeting baseline “application-class” functionality, i.e., supporting the Linux OS and its application environment, based on our open-source single-issue in-order implementation of the 64-bit ISA variant (RV64GC) called Ariane. Our analysis builds on silicon measurements and calibrated simulation of an Ariane instance (RV64IMC) taped out in GlobalFoundries 22FDX technology. Ariane runs at up to 1.7 GHz and achieves up to 40 Gop/s/W energy efficiency, which is superior to similar cores presented in the literature. We provide insight into the interplay between the functionality required for application-class execution (e.g., virtual memory, caches, and multiple modes of privileged operation) and its energy cost. We also compare Ariane with RISCY, a simpler and slower microcontroller-class core. Our analysis confirms that supporting application-class execution implies a non-negligible energy-efficiency loss and that compute performance is more cost-effectively boosted by instruction extensions (e.g., packed SIMD) than by high-frequency operation.

195 citations

Journal ArticleDOI
TL;DR: The role of ML in IoT from the cloud down to embedded devices is reviewed and the state-of-the-art usages are categorized according to their application domain, input data type, exploited ML techniques, and where they belong in the cloud-to-things continuum.
Abstract: With the numerous Internet of Things (IoT) devices, cloud-centric data processing fails to meet the requirements of all IoT applications. The limited computation and communication capacity of the cloud necessitates edge computing, i.e., starting the processing of IoT data at the edge and transforming connected devices into intelligent devices. Machine learning (ML), the key means for information inference, should extend to the cloud-to-things continuum too. This paper reviews the role of ML in IoT from the cloud down to embedded devices. Different usages of ML for application data processing and management tasks are studied. The state-of-the-art usages of ML in IoT are categorized according to their application domain, input data type, exploited ML techniques, and where they belong in the cloud-to-things continuum. The challenges and research trends toward efficient ML on the IoT edge are discussed. Moreover, publications on “ML in IoT” are retrieved and analyzed systematically using ML classification techniques, and the growing topics and application domains are identified.

157 citations