Author
Veit B. Kleeberger
Other affiliations: Technische Universität München
Bio: Veit B. Kleeberger is an academic researcher from Infineon Technologies. The author has contributed to research in topics: Fault injection & Resilience (network). The author has an h-index of 10 and has co-authored 30 publications receiving 291 citations. Previous affiliations of Veit B. Kleeberger include Technische Universität München.
Papers
••
29 May 2013
TL;DR: This paper presents an approach which is able to find the optimal sizing of basic circuit blocks considering process variation, and utilizes this approach to predict the impact of scaling in FinFET technologies and the influence of process variations in future technology nodes.
Abstract: With continued scaling of CMOS technology it becomes increasingly difficult to maintain reliable circuits. Early predictive technology and design exploration help to understand major effects of variability sources and their impact on circuit performances. With each new technology basic circuit blocks have to be redesigned to appropriately evaluate the impact of technology scaling. Therefore, this paper presents an approach which is able to find the optimal sizing of basic circuit blocks considering process variation. We utilize this approach to predict the impact of scaling in FinFET technologies and the influence of process variations in future technology nodes.
50 citations
••
TL;DR: A model for NBTI degradation and recovery based on trapping/detrapping is developed, which accurately describes both the relaxation during detrapping and the quasi-permanent degradation, and shows good agreement with measurements from a 65 nm technology.
36 citations
••
TL;DR: This paper shows by example how probabilistic bit flips are systematically abstracted and propagated towards higher abstraction levels up to the application software layer, and how RAP can be used to parameterize architecture-level resilience methods.
28 citations
••
TL;DR: This article illustrates a methodology for dealing with scaling-related problems via two case studies that link models of low-level technology-related problems to system behavior, which spreads the burden of ensuring resilience across multiple levels of the design hierarchy.
Abstract: Highly scaled technologies at and beyond the 22-nm node exhibit increased sensitivity to various scaling-related problems that conspire to reduce the overall reliability of integrated circuits and systems. In prior technology nodes, the assumption was that manufacturing technology was responsible for ensuring device reliability. This basic assumption is no longer tenable. Trying to contain reliability problems purely at the technology level would cause prohibitive increases in power consumption. Thus, a cross-layer approach is required, which spreads the burden of ensuring resilience across multiple levels of the design hierarchy. This article illustrates a methodology for dealing with scaling-related problems via two case studies that link models of low-level technology-related problems to system behavior.
27 citations
••
01 Jun 2014
TL;DR: An enhanced static timing analysis is presented which links technology-level effects to system-level and vice versa and discusses the accurate and efficient consideration of system workload and impact of executed instructions on circuit timing.
Abstract: In today's design of resilient embedded systems, logic circuit components play a key role. Many possible design choices at the gate level, such as implementation architecture or synthesis constraints, are vital for the resilience of the entire system. Hence, EDA algorithms at this level have to support exposing technology characteristics (such as process variations or aging) for consideration on higher levels of abstraction. Similarly, key parameters from system level, such as workload or executed processor instructions, have to be considered at lower levels for accurate analysis of, e.g., degradation effects. Circuit-level timing analysis plays a key role in this context as it provides key metrics such as achievable frequency, available timing margins and timing violation vulnerabilities of the analyzed circuit. We present an enhanced static timing analysis which links technology-level effects to system-level and vice versa. Specifically, we discuss the accurate and efficient consideration of system workload and impact of executed instructions on circuit timing.
20 citations
Cited by
••
Argonne National Laboratory, Intel, University of Texas at Austin, University of Illinois at Urbana–Champaign, Purdue University, Lawrence Livermore National Laboratory, IBM, University of Chicago, Los Alamos National Laboratory, Information Sciences Institute, Oak Ridge National Laboratory, Booz Allen Hamilton, Science Applications International Corporation, Pacific Northwest National Laboratory, Advanced Micro Devices, Stanford University, Hewlett-Packard, Sandia National Laboratories
TL;DR: This report, produced by a workshop on ‘Addressing failures in exascale computing’ held in Park City, Utah, 4–11 August 2012, summarizes and builds on the workshop's discussions on resilience.
Abstract: We present here a report produced by a workshop on 'Addressing failures in exascale computing' held in Park City, Utah, 4-11 August 2012. The charter of this workshop was to establish a common taxonomy about resilience across all the levels in a computing system, discuss existing knowledge on resilience across the various hardware and software layers of an exascale system, and build on those results, examining potential solutions from both a hardware and software perspective and focusing on a combined approach.
The workshop brought together participants with expertise in applications, system software, and hardware; they came from industry, government, and academia, and their interests ranged from theory to implementation. The combination allowed broad and comprehensive discussions and led to this document, which summarizes and builds on those discussions.
406 citations
••
29 May 2013
TL;DR: In this article, the authors introduce the most prominent reliability concerns from today's points of view and roughly recapitulate the progress in the community so far and suggest a way for coping with reliability challenges in upcoming technology nodes.
Abstract: Reliability concerns due to technology scaling have been a major focus of researchers and designers for several technology nodes. Therefore, many new techniques for enhancing and optimizing reliability have emerged particularly within the last five to ten years. This perspective paper introduces the most prominent reliability concerns from today's points of view and roughly recapitulates the progress in the community so far. The focus of this paper is on perspective trends from the industrial as well as academic points of view that suggest a way for coping with reliability challenges in upcoming technology nodes.
197 citations
••
Karlsruhe Institute of Technology, University of Ulm, Goethe University Frankfurt, Technische Universität München, Technical University of Dortmund, Braunschweig University of Technology, Dresden University of Technology, University of Erlangen-Nuremberg, University of Paderborn, University of Tübingen, Kaiserslautern University of Technology, University of Stuttgart
TL;DR: An overview is presented of a major research project on dependable embedded systems that started in Fall 2010 and has a projected duration of six years, including a new classification of faults, errors, and failures.
Abstract: The paper presents an overview of a major research project on dependable embedded systems that has started in Fall 2010 and is running for a projected duration of six years. Aim is a ‘dependability co-design’ that spans various levels of abstraction in the design process of embedded systems starting from gate level through operating system, applications software to system architecture. In addition, we present a new classification on faults, errors, and failures.
99 citations
••
TL;DR: A System-on-Chip perspective is used to show how the CyberPhysical System-on-Chip (CPSoC) exemplar platform achieves self-awareness through a combination of cross-layer sensing, actuation, self-aware adaptations, and online learning.
Abstract: Embedded systems must address a multitude of potentially conflicting design constraints such as resiliency, energy, heat, cost, performance, security, etc., all in the face of highly dynamic operational behaviors and environmental conditions. By incorporating elements of intelligence, the hope is that the resulting “smart” embedded systems will function correctly and within desired constraints in spite of highly dynamic changes in the applications and the environment, as well as in the underlying software/hardware platforms. Since terms related to “smartness” (e.g., self-awareness, self-adaptivity, and autonomy) have been used loosely in many software and hardware computing contexts, we first present a taxonomy of “self-x” terms and use this taxonomy to relate major “smart” software and hardware computing efforts. A major attribute for smart embedded systems is the notion of self-awareness that enables an embedded system to monitor its own state and behavior, as well as the external environment, so as to adapt intelligently. Toward this end, we use a System-on-Chip perspective to show how the CyberPhysical System-on-Chip (CPSoC) exemplar platform achieves self-awareness through a combination of cross-layer sensing, actuation, self-aware adaptations, and online learning. We conclude with some thoughts on open challenges and research directions.
82 citations
••
TL;DR: A comprehensive review of the reliability of EVs’ components from different points of view is presented, and the challenges and future perspectives of EVs relating to reliability and safety are investigated.
77 citations