scispace - formally typeset
Search or ask a question
Author

Mohamed M. Sabry

Bio: Mohamed M. Sabry is an academic researcher from École Polytechnique Fédérale de Lausanne. The author has contributed to research in topics: Computer cooling & Efficient energy use. The author has an hindex of 13, co-authored 33 publications receiving 524 citations. Previous affiliations of Mohamed M. Sabry include École Polytechnique & Stanford University.

Papers
More filters
Proceedings ArticleDOI
01 Dec 2016
TL;DR: Hard-error analysis shows the HD architecture is amazingly resilient to RRAM endurance failures, making the use of various types of RRAMs/CBRAMs feasible, and Multiplication-addition-permutation (MAP), the central operations of HD computing, are experimentally demonstrated.
Abstract: The ability to learn from few examples, known as one-shot learning, is a hallmark of human cognition. Hyperdimensional (HD) computing is a brain-inspired computational framework capable of one-shot learning, using random binary vectors with high dimensionality. Device-architecture co-design of HD cognitive computing systems using 3D VRRAM/CMOS is presented for language recognition. Multiplication-addition-permutation (MAP), the central operations of HD computing, are experimentally demonstrated on 4-layer 3D VRRAM/FinFET as non-volatile in-memory MAP kernels. Extensive cycle-to-cycle (up to 1012 cycles) and wafer-level device-to-device (256 RRAMs) experiments are performed to validate reproducibility and robustness. For 28-nm node, the 3D in-memory architecture reduces total energy consumption by 52.2% with 412 times less area compared with LP digital design (using registers as memory), owing to the energy-efficient VRRAM MAP kernels and dense connectivity. Meanwhile, the system trained with 21 samples texts achieves 90.4% accuracy recognizing 21 European languages on 21,000 test sentences. Hard-error analysis shows the HD architecture is amazingly resilient to RRAM endurance failures, making the use of various types of RRAMs/CBRAMs (1k ∼ 10M endurance) feasible.

107 citations

Journal ArticleDOI
TL;DR: A systematic classification of approaches that increase system resilience in the presence of functional hardware (HW)-induced errors is presented, dealing with higher system abstractions, such as the (micro)architecture, the mapping, and platform software (SW).
Abstract: Nanoscale technology nodes bring reliability concerns back to the center stage of digital system design. A systematic classification of approaches that increase system resilience in the presence of functional hardware (HW)-induced errors is presented, dealing with higher system abstractions, such as the (micro)architecture, the mapping, and platform software (SW). The field is surveyed in a systematic way based on nonoverlapping categories, which add insight into the ongoing work by exposing similarities and differences. HW and SW solutions are discussed in a similar fashion so that interrelationships become apparent. The presented categories are illustrated by representative literature examples to illustrate their properties. Moreover, it is demonstrated how hybrid schemes can be decomposed into their primitive components.

103 citations

Journal ArticleDOI
TL;DR: It is shown how technology, RRAM programing-, and system resilience-level solutions can be effectively combined to design new generations of energy-efficient computing systems that can successfully run deep learning applications despite TWFs and PWFs.
Abstract: Limited endurance of resistive RAM (RRAM) is a major challenge for future computing systems. Using thorough endurance tests that incorporate fine-grained read operations at the array level, we quantify for the first time temporary write failures (TWFs) caused by intrinsic RRAM cycle-to-cycle and cell-to-cell variations. We also quantify permanent write failures (PWFs) caused by irreversible breakdown/dissolution of the conductive filament. We show how technology-, RRAM programing-, and system resilience-level solutions can be effectively combined to design new generations of energy-efficient computing systems that can successfully run deep learning (and other machine learning) applications despite TWFs and PWFs. We analyze corresponding system lifetimes and TWF bit error ratio.

42 citations

Proceedings ArticleDOI
09 Mar 2015
TL;DR: This work presents an overview of the progress toward realizing monolithic 3D ICs, enabled by recent advances in emerging nanotechnologies such as carbon nanotube field-effect transistors and emerging memory technologies such as Resistive RAMs and Spin-Transfer Torque RAMs.
Abstract: Monolithic three-dimensional (3D) integration enables revolutionary digital system architectures of computation immersed in memory. Vertically-stacked layers of logic circuits and memories, with nano-scale inter-layer vias (with the same pitch and dimensions as tight-pitched metal layer vias), provide massive connectivity between the layers. The nano-scale inter-layer vias are orders of magnitude denser than conventional through silicon vias (TSVs). Such digital system architectures can achieve significant performance and energy efficiency benefits compared to today's designs. The massive vertical connectivity makes such architectures particularly attractive for abundant-data applications that impose stringent requirements with respect to low-latency data processing, high-bandwidth data transfer, and energy-efficient storage of massive amounts of data. We present an overview of our progress toward realizing monolithic 3D ICs, enabled by recent advances in emerging nanotechnologies such as carbon nanotube field-effect transistors and emerging memory technologies such as Resistive RAMs and Spin-Transfer Torque RAMs.

38 citations

Proceedings ArticleDOI
24 Mar 2014
TL;DR: A global control scheme which tackles the concerns on the stability of enterprise servers while reducing the performance degradation caused by the variable fan speed control scheme and guarantees the server stability while minimizing the overall performance degradation.
Abstract: Time lag and quantization in temperature sensors in enterprise servers lead to stability concerns on existing variable fan speed control schemes. Stability challenges become further aggravated when multiple local controllers are running together with the fan control scheme. In this paper, we present a global control scheme which tackles the concerns on the stability of enterprise servers while reducing the performance degradation caused by the variable fan speed control scheme. We first present a stable fan speed control scheme based on the Proportional-Integral-Derivative (PID) controller by adaptively adjusting the PID parameters according to the operating fan speed and eliminating the fan speed oscillation caused by temperature quantization. Then, we present a global control scheme which coordinates control actions among multiple local controllers. In addition, it guarantees the server stability while minimizing the overall performance degradation. We validated the proposed control scheme using a presently shipping commercial enterprise server. Our experimental results show that the proposed fan control scheme is stable under the non-ideal temperature measurement system (10 sec in time lag and 1°C in quantization figures). Furthermore, the global control scheme enables to run multiple local controllers in a stable manner while reducing the performance degradation up to 19.2% compared to conventional coordination schemes with 19.1% savings in power consumption.

28 citations


Cited by
More filters
Journal ArticleDOI
01 Jan 2018
TL;DR: The state of the art in memristor-based electronics is evaluated and the future development of such devices in on-chip memory, biologically inspired computing and general-purpose in-memory computing is explored.
Abstract: A memristor is a resistive device with an inherent memory. The theoretical concept of a memristor was connected to physically measured devices in 2008 and since then there has been rapid progress in the development of such devices, leading to a series of recent demonstrations of memristor-based neuromorphic hardware systems. Here, we evaluate the state of the art in memristor-based electronics and explore where the future of the field lies. We highlight three areas of potential technological impact: on-chip memory and storage, biologically inspired computing and general-purpose in-memory computing. We analyse the challenges, and possible solutions, associated with scaling the systems up for practical applications, and consider the benefits of scaling the devices down in terms of geometry and also in terms of obtaining fundamental control of the atomic-level dynamics. Finally, we discuss the ways we believe biology will continue to provide guiding principles for device innovation and system optimization in the field. This Perspective evaluates the state of the art in memristor-based electronics and explores the future development of such devices in on-chip memory, biologically inspired computing and general-purpose in-memory computing.

1,231 citations

Journal ArticleDOI
TL;DR: An in-depth study of the existing literature on data center power modeling, covering more than 200 models, organized in a hierarchical structure with two main branches focusing on hardware-centric and software-centric power models.
Abstract: Data centers are critical, energy-hungry infrastructures that run large-scale Internet-based services. Energy consumption models are pivotal in designing and optimizing energy-efficient operations to curb excessive energy consumption in data centers. In this paper, we survey the state-of-the-art techniques used for energy consumption modeling and prediction for data centers and their components. We conduct an in-depth study of the existing literature on data center power modeling, covering more than 200 models. We organize these models in a hierarchical structure with two main branches focusing on hardware-centric and software-centric power models. Under hardware-centric approaches we start from the digital circuit level and move on to describe higher-level energy consumption models at the hardware component level, server level, data center level, and finally systems of systems level. Under the software-centric approaches we investigate power models developed for operating systems, virtual machines and software applications. This systematic approach allows us to identify multiple issues prevalent in power modeling of different levels of data center systems, including: i) few modeling efforts targeted at power consumption of the entire data center ii) many state-of-the-art power models are based on a few CPU or server metrics, and iii) the effectiveness and accuracy of these power models remain open questions. Based on these observations, we conclude the survey by describing key challenges for future research on constructing effective and accurate data center power models.

741 citations

Journal ArticleDOI
TL;DR: New non-volatile memory devices store information using different physical mechanisms from those employed in today's memories and could achieve substantial improvements in computing performance and energy efficiency.
Abstract: New non-volatile memory devices store information using different physical mechanisms from those employed in today's memories and could achieve substantial improvements in computing performance and energy efficiency.

677 citations

Journal ArticleDOI
TL;DR: In this article, the state-of-the-art of multi-level thermal management techniques for both air- and liquid-cooled data centers is reviewed. But the main focus is on the sources of inefficiencies and the improvement methods with their configuration features and performances at each level.

272 citations