Author

Massoud Pedram

Bio: Massoud Pedram is an academic researcher from the University of Southern California. The author has contributed to research in topics: Energy consumption & CMOS. The author has an h-index of 77 and has co-authored 780 publications receiving 23,047 citations. Previous affiliations of Massoud Pedram include the University of California, Berkeley and Syracuse University.


Papers
Journal ArticleDOI
TL;DR: Experimental results demonstrate that static switch sizing can achieve a 6% power conversion efficiency enhancement, which translates to a 19% reduction in power loss during general usage of the smartphone, while dynamic switch modulation accomplishes a similar improvement under the same conditions while also achieving high efficiency enhancement across various load conditions.
Abstract: Smartphones consume a significant amount of power. Indeed, they can hardly provide a full day of use between charging operations even with a 2000 mAh battery. While power minimization and dynamic power management techniques have been heavily explored to improve the power efficiency of modules (processors, memory, display, GPS, etc.) inside a smartphone platform, there is one critical factor that is often overlooked: the power conversion efficiency of the power delivery network (PDN). This paper focuses on dc-dc converters, which play a pivotal role in the PDN of the smartphone platform. Starting from detailed models of the dc-dc converter designs, two optimization methods are presented: 1) static switch sizing to maximize the efficiency of a dc-dc converter under statistical loading profiles and 2) dynamic switch modulation to achieve high efficiency enhancement under dynamically varying load conditions. To verify the efficacy of the optimization methods in actual smartphone platforms, this paper also presents a characterization procedure for the PDN. The procedure is as follows: 1) group the modules in the smartphone platform together and use profiling to estimate their average and peak power consumption levels and 2) build an equivalent dc-dc converter model for the power delivery path from the battery source to each group of modules and use linear regression to estimate the conversion efficiency of the corresponding equivalent converter. Experimental results demonstrate that static switch sizing can achieve a 6% power conversion efficiency enhancement, which translates to a 19% reduction in power loss during general usage of the smartphone. Dynamic switch modulation accomplishes a similar improvement under the same conditions, while also achieving high efficiency enhancement across various load conditions.
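As a rough illustration of step 2 of the characterization procedure, the sketch below fits a simple quadratic loss model for an equivalent dc-dc converter by least-squares regression and derives the conversion efficiency from it. The power values, the form of the loss model, and the function names are illustrative assumptions, not data or code from the paper.

```python
import numpy as np

# Hypothetical profiling data: average output power drawn by a module group (W)
# and the corresponding input power measured at the battery side (W).
p_out = np.array([0.2, 0.5, 0.8, 1.2, 1.6, 2.0, 2.5])
p_in  = np.array([0.27, 0.59, 0.92, 1.37, 1.81, 2.27, 2.84])

# Assume a quadratic loss model for the equivalent converter:
#   P_in = P_out + (a + b*P_out + c*P_out**2)
# where a captures fixed (controller/gate-drive) loss, b conduction loss, and
# c the growth of switching/resistive loss. Fit a, b, c by least squares.
loss = p_in - p_out
X = np.column_stack([np.ones_like(p_out), p_out, p_out**2])
coeffs, *_ = np.linalg.lstsq(X, loss, rcond=None)
a, b, c = coeffs

def efficiency(p_load):
    """Estimated conversion efficiency of the equivalent converter at a given load."""
    p_loss = a + b * p_load + c * p_load**2
    return p_load / (p_load + p_loss)

print(f"Estimated efficiency at 1.0 W load: {efficiency(1.0):.3f}")
```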

22 citations

Proceedings ArticleDOI
30 Sep 2012
TL;DR: A model-free reinforcement learning (RL) approach is presented for an adaptive DPM framework in systems with bursty workloads, using a hybrid power supply composed of Li-ion batteries and supercapacitors.
Abstract: Dynamic power management (DPM) in battery-powered mobile systems attempts to achieve higher energy efficiency by selectively setting idle components to a sleep state. However, re-activating these components at a later time consumes a large amount of energy, creating a significant power draw from the battery supply in the system. This is known as the energy overhead of the “wakeup” operation. We start from the observation that, due to the rate capacity effect in Li-ion batteries, which are commonly used to power mobile systems, the actual energy overhead is in fact larger than previously thought. Next, we present a model-free reinforcement learning (RL) approach for an adaptive DPM framework in systems with bursty workloads, using a hybrid power supply composed of Li-ion batteries and supercapacitors. Simulation results show that our technique enhances power efficiency by up to 9% compared to a battery-only power supply. Our RL-based DPM approach also achieves a much lower energy-delay product compared to a previously reported expert-based learning approach.
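The sketch below gives a minimal flavor of model-free learning for DPM sleep decisions: an epsilon-greedy agent learns, per idle-length bin, whether sleeping pays off once the wake-up energy overhead is charged against it. The power numbers, cost model, and workload distribution are illustrative assumptions; this is not the hybrid battery/supercapacitor framework of the paper.

```python
import random
from collections import defaultdict

ACTIONS = ("stay_active", "sleep")
IDLE_BINS = 4                       # discretized estimate of upcoming idle length

P_ACTIVE, P_SLEEP = 1.0, 0.05       # power (W) per state (assumed)
E_WAKEUP = 2.0                      # energy overhead of re-activation (J), assumed
LATENCY_PENALTY = 0.5               # cost weight for wake-up delay (assumed)

ALPHA, EPSILON = 0.1, 0.1
Q = defaultdict(float)              # learned expected cost of (state, action)

def cost(action, idle_time):
    """Energy-plus-delay cost of one idle period under a given action."""
    if action == "sleep":
        return P_SLEEP * idle_time + E_WAKEUP + LATENCY_PENALTY
    return P_ACTIVE * idle_time

def choose(state):
    """Epsilon-greedy action selection over the learned costs."""
    if random.random() < EPSILON:
        return random.choice(ACTIONS)
    return min(ACTIONS, key=lambda a: Q[(state, a)])

def train(num_idle_periods=20000):
    for _ in range(num_idle_periods):
        idle_time = random.expovariate(0.5)          # bursty idle lengths (assumed)
        state = min(IDLE_BINS - 1, int(idle_time))   # crude state abstraction
        action = choose(state)
        c = cost(action, idle_time)
        # One-step update: each idle period is treated as its own episode.
        Q[(state, action)] += ALPHA * (c - Q[(state, action)])

train()
for s in range(IDLE_BINS):
    best = min(ACTIONS, key=lambda a: Q[(s, a)])
    print(f"idle bin {s}: learned policy -> {best}")
```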

22 citations

Journal ArticleDOI
TL;DR: A physical mapping tool for quantum circuits is presented, which generates the optimal universal logic block (ULB) that can, on average, perform any logical fault-tolerant quantum operation with minimum latency.
Abstract: This paper presents a physical mapping tool for quantum circuits, which generates the optimal universal logic block (ULB) that can, on average, perform any logical fault-tolerant (FT) quantum operation with minimum latency. The operation scheduling, placement, and qubit routing problems tackled by the quantum physical mapper are highly dependent on one another. More precisely, the scheduling solution affects the quality of the achievable placement solution due to resource pressures that may be created as a result of operation scheduling, whereas the operation placement and qubit routing solutions influence the scheduling solution due to the resulting distances between predecessor and current operations, which in turn determine routing latencies. The proposed flow for the quantum physical mapper captures these dependencies by applying (1) a loose scheduling step, which transforms an initial quantum data flow graph into one that explicitly captures the no-cloning theorem of quantum computing and then performs instruction scheduling based on a modified force-directed scheduling approach to minimize resource contention and quantum circuit latency, (2) a placement step, which uses timing-driven instruction placement to minimize the approximate routing latencies while making iterative calls to the aforesaid force-directed scheduler to correct the scheduling levels of quantum operations as needed, and (3) a routing step that finds dynamic values of routing latencies for the qubits. In addition to the quantum physical mapper, an approach is presented to determine the single best ULB size for a target quantum circuit by examining the latency of different FT quantum operations mapped onto different ULB sizes and using information about the occurrence frequency of operations on critical paths of the target quantum algorithm to weigh these latencies. Experimental results show an average latency reduction of about 40% compared to previous work.
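To make the scheduling step more concrete, the sketch below computes ASAP/ALAP levels and mobility for operations in a small dependency graph, the quantities a force-directed scheduler uses to spread operations across time steps and reduce resource contention. The toy gate graph, unit latencies, and function names are illustrative assumptions, not the paper's mapper.

```python
def asap(deps, ops):
    """Earliest possible time step for each operation (unit latencies assumed)."""
    level = {}
    def visit(op):
        if op not in level:
            level[op] = 1 + max((visit(p) for p in deps.get(op, ())), default=-1)
        return level[op]
    for op in ops:
        visit(op)
    return level

def alap(deps, ops, latest):
    """Latest time step for each operation that still meets the latency bound."""
    succs = {op: [] for op in ops}
    for op, parents in deps.items():
        for p in parents:
            succs[p].append(op)
    level = {}
    def visit(op):
        if op not in level:
            level[op] = min((visit(s) for s in succs[op]), default=latest + 1) - 1
        return level[op]
    for op in ops:
        visit(op)
    return level

# Hypothetical dependency graph: operation -> list of predecessor operations.
ops = ["h1", "cnot1", "t1", "x1", "meas"]
deps = {"cnot1": ["h1"], "t1": ["cnot1"], "x1": ["h1"], "meas": ["t1", "x1"]}

s = asap(deps, ops)
l = alap(deps, ops, latest=max(s.values()))
for op in ops:
    # Operations with nonzero mobility can be moved to less crowded time steps.
    print(f"{op}: asap={s[op]} alap={l[op]} mobility={l[op] - s[op]}")
```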

22 citations

Journal ArticleDOI
TL;DR: It is advanced that the critical requirement for DSMs is high-frequency processing and not a high oversampling ratio, and a single-bit semiparallel processing structure to accomplish this high-frequency processing is proposed in this paper.
Abstract: The oversampling requirement in a delta-sigma modulator (DSM) is considered one of the limiting factors toward its employment in current high-frequency applications, such as wireless software defined radio (SDR) systems. This paper advances that the critical requirement for DSMs is high-frequency processing and not a high oversampling ratio. A single-bit semiparallel processing structure to accomplish the high-frequency processing is proposed in this paper. Using the suggested low-oversampling digital DSM architecture, high-speed, high-complexity computations, which are normally required for wireless applications, are executed simultaneously. This facilitates the design of embedded SDR multistandard transmitters using commercially available digital processors. The most favorable application of the proposed single-bit DSM is to build a radio frequency transmitter that includes a one-bit quantizer with a two-level switching power amplifier for both high linearity and high efficiency. Performance analysis is carried out using MATLAB simulations, which show a reduction of the oversampling ratio by a factor of 16 (for a baseline oversampling ratio of 256) with the same signal-to-noise ratio (SNR). The proposed DSM is also implemented on a field-programmable gate array (FPGA) board and its performance is validated using a code division multiple access signal. The bandwidth of the output signal is increased fourfold without increasing the processing frequency, while the quality of the output signal remains the same and FPGA resource usage increases by a factor of three.
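For context, the sketch below implements a baseline first-order single-bit digital delta-sigma modulator and measures its in-band SNR for a given oversampling ratio; it illustrates the OSR/noise-shaping trade-off the paper targets but is not the proposed semiparallel low-OSR architecture. The signal parameters and OSR value are illustrative assumptions.

```python
import numpy as np

def dsm_first_order(x):
    """Single-bit first-order delta-sigma modulation of an input x in [-1, 1]."""
    acc, out = 0.0, np.empty_like(x)
    for i, sample in enumerate(x):
        acc += sample - (1.0 if acc >= 0 else -1.0)   # integrate the quantization error
        out[i] = 1.0 if acc >= 0 else -1.0            # one-bit quantizer
    return out

def in_band_snr(x, y, osr):
    """SNR after an ideal brick-wall filter over the signal band (fs / (2*OSR))."""
    n = len(x)
    band = n // (2 * osr)
    X, Y = np.fft.rfft(x), np.fft.rfft(y)
    sig = np.sum(np.abs(X[:band]) ** 2)
    err = np.sum(np.abs(Y[:band] - X[:band]) ** 2)
    return 10 * np.log10(sig / err)

osr, n = 64, 1 << 16
t = np.arange(n)
x = 0.5 * np.sin(2 * np.pi * t / (2 * osr * 16))      # tone well inside the signal band
y = dsm_first_order(x)
print(f"in-band SNR at OSR={osr}: {in_band_snr(x, y, osr):.1f} dB")
```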

22 citations

Proceedings ArticleDOI
20 Jul 2020
TL;DR: This work proposes a hardware-aware pruning approach that can fully adapt to the loop tiling technique of FPGA design and is applied to a novel 3D network called R(2+1)D.
Abstract: There have been many recent attempts to extend the successes of convolutional neural networks (CNNs) from 2-dimensional (2D) image classification to 3-dimensional (3D) video recognition by exploring 3D CNNs. Considering the emerging growth of the mobile and Internet of Things (IoT) market, it is essential to investigate the deployment of 3D CNNs on edge devices. Previous works have implemented standard 3D CNNs (C3D) on hardware platforms; however, they have not exploited model compression to accelerate inference. This work proposes a hardware-aware pruning approach that can fully adapt to the loop tiling technique of FPGA design and is applied to a novel 3D network called R(2+1)D. Leveraging the powerful alternating direction method of multipliers (ADMM), the proposed pruning method achieves simultaneous high accuracy and significant acceleration of computation on FPGA. With layer-wise pruning rates up to 10× and negligible accuracy loss, the pruned model is implemented on a Xilinx ZCU102 FPGA board, where it achieves a 2.6× speedup compared with the unpruned version, and a 2.3× speedup and 2.3× power efficiency improvement compared with the state-of-the-art FPGA implementation of C3D.
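The sketch below illustrates the kind of hardware-aware constraint involved: weights are removed in whole blocks whose shape matches assumed FPGA loop-tiling factors, so the surviving weights map cleanly onto the tiled dataflow. The tile shape, pruning rate, and simple magnitude-based selection are illustrative assumptions; the paper embeds such a constraint inside an ADMM-based training flow rather than applying it once as shown here.

```python
import numpy as np

def prune_to_tiles(weights, tile, keep_ratio):
    """Keep only the highest-magnitude tiles of a 2D weight matrix.

    weights    : 2D array (e.g. a conv kernel flattened to out_ch x in_ch*k*k)
    tile       : (rows, cols) tiling factors of the FPGA design (assumed)
    keep_ratio : fraction of tiles to keep (e.g. 0.1 for a 10x pruning rate)
    """
    r, c = tile
    rows, cols = weights.shape
    assert rows % r == 0 and cols % c == 0, "pad weights to a multiple of the tile"
    # View the matrix as a grid of tiles and score each tile by its L2 norm.
    grid = weights.reshape(rows // r, r, cols // c, c)
    scores = np.sqrt((grid ** 2).sum(axis=(1, 3)))
    k = max(1, int(round(keep_ratio * scores.size)))
    threshold = np.partition(scores.ravel(), -k)[-k]
    mask = (scores >= threshold)[:, None, :, None]     # broadcast back over tiles
    return (grid * mask).reshape(rows, cols)

rng = np.random.default_rng(0)
w = rng.normal(size=(64, 128)).astype(np.float32)
w_pruned = prune_to_tiles(w, tile=(8, 16), keep_ratio=0.1)
print("non-zero fraction:", np.count_nonzero(w_pruned) / w_pruned.size)
```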

22 citations


Cited by
Journal ArticleDOI

08 Dec 2001-BMJ
TL;DR: There is, I think, something ethereal about i, the square root of minus one, which seemed an odd beast at the time: an intruder hovering on the edge of reality.
Abstract: There is, I think, something ethereal about i —the square root of minus one. I remember first hearing about it at school. It seemed an odd beast at that time—an intruder hovering on the edge of reality. Usually familiarity dulls this sense of the bizarre, but in the case of i it was the reverse: over the years the sense of its surreal nature intensified. It seemed that it was impossible to write mathematics that described the real world in …

33,785 citations

Journal ArticleDOI
TL;DR: Machine learning addresses many of the same research questions as the fields of statistics, data mining, and psychology, but with differences of emphasis.
Abstract: Machine Learning is the study of methods for programming computers to learn. Computers are applied to a wide range of tasks, and for most of these it is relatively easy for programmers to design and implement the necessary software. However, there are many tasks for which this is difficult or impossible. These can be divided into four general categories. First, there are problems for which there exist no human experts. For example, in modern automated manufacturing facilities, there is a need to predict machine failures before they occur by analyzing sensor readings. Because the machines are new, there are no human experts who can be interviewed by a programmer to provide the knowledge necessary to build a computer system. A machine learning system can study recorded data and subsequent machine failures and learn prediction rules. Second, there are problems where human experts exist, but where they are unable to explain their expertise. This is the case in many perceptual tasks, such as speech recognition, hand-writing recognition, and natural language understanding. Virtually all humans exhibit expert-level abilities on these tasks, but none of them can describe the detailed steps that they follow as they perform them. Fortunately, humans can provide machines with examples of the inputs and correct outputs for these tasks, so machine learning algorithms can learn to map the inputs to the outputs. Third, there are problems where phenomena are changing rapidly. In finance, for example, people would like to predict the future behavior of the stock market, of consumer purchases, or of exchange rates. These behaviors change frequently, so that even if a programmer could construct a good predictive computer program, it would need to be rewritten frequently. A learning program can relieve the programmer of this burden by constantly modifying and tuning a set of learned prediction rules. Fourth, there are applications that need to be customized for each computer user separately. Consider, for example, a program to filter unwanted electronic mail messages. Different users will need different filters. It is unreasonable to expect each user to program his or her own rules, and it is infeasible to provide every user with a software engineer to keep the rules up-to-date. A machine learning system can learn which mail messages the user rejects and maintain the filtering rules automatically. Machine learning addresses many of the same research questions as the fields of statistics, data mining, and psychology, but with differences of emphasis. Statistics focuses on understanding the phenomena that have generated the data, often with the goal of testing different hypotheses about those phenomena. Data mining seeks to find patterns in the data that are understandable by people. Psychological studies of human learning aspire to understand the mechanisms underlying the various learning behaviors exhibited by people (concept learning, skill acquisition, strategy change, etc.).
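As a toy illustration of the fourth category above (a personalized mail filter learned from examples rather than hand-coded rules), the sketch below trains a tiny naive Bayes classifier from messages a hypothetical user kept or rejected. The dataset and the choice of model are illustrative assumptions, not part of the cited article.

```python
from collections import Counter
import math

# Hypothetical training data: message text and whether the user rejected it.
training = [
    ("win a free prize now", "reject"),
    ("limited offer click now", "reject"),
    ("meeting agenda for monday", "keep"),
    ("project status and next steps", "keep"),
]

word_counts = {"reject": Counter(), "keep": Counter()}
class_counts = Counter()
for text, label in training:
    class_counts[label] += 1
    word_counts[label].update(text.split())

def classify(text):
    """Naive Bayes with add-one smoothing over the training vocabulary."""
    vocab = set(word_counts["reject"]) | set(word_counts["keep"])
    best_label, best_score = None, float("-inf")
    for label in class_counts:
        score = math.log(class_counts[label] / sum(class_counts.values()))
        total = sum(word_counts[label].values())
        for word in text.split():
            score += math.log((word_counts[label][word] + 1) / (total + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label

print(classify("free prize offer"))       # expected: reject
print(classify("status meeting monday"))  # expected: keep
```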

13,246 citations

Christopher M. Bishop
01 Jan 2006
TL;DR: Probability distributions and linear models for regression and classification are presented in this book, along with a discussion of combining models in the context of machine learning and classification.
Abstract: Probability Distributions.- Linear Models for Regression.- Linear Models for Classification.- Neural Networks.- Kernel Methods.- Sparse Kernel Machines.- Graphical Models.- Mixture Models and EM.- Approximate Inference.- Sampling Methods.- Continuous Latent Variables.- Sequential Data.- Combining Models.

10,141 citations