Other affiliations: Magma Design Automation
Bio: Shahin Nazarian is an academic researcher from University of Southern California. The author has contributed to research in topic(s): Logic gate & Smart grid. The author has an hindex of 18, co-authored 121 publication(s) receiving 1420 citation(s). Previous affiliations of Shahin Nazarian include Magma Design Automation.
Topics: Logic gate, Smart grid, Electronic circuit, CMOS, Lookup table
Papers published on a yearly basis
••25 Sep 2006
TL;DR: A brief discussion of key sources of power dissipation and their temperature relation in CMOS VLSI circuits, and techniques for full-chip temperature calculation with special attention to its implications on the design of high-performance, low-power V LSI circuits is presented.
Abstract: The growing packing density and power consumption of very large scale integration (VLSI) circuits have made thermal effects one of the most important concerns of VLSI designers The increasing variability of key process parameters in nanometer CMOS technologies has resulted in larger impact of the substrate and metal line temperatures on the reliability and performance of the devices and interconnections Recent data shows that more than 50% of all integrated circuit failures are related to thermal issues This paper presents a brief discussion of key sources of power dissipation and their temperature relation in CMOS VLSI circuits, and techniques for full-chip temperature calculation with special attention to its implications on the design of high-performance, low-power VLSI circuits The paper is concluded with an overview of techniques to improve the full-chip thermal integrity by means of off-chip versus on-chip and static versus adaptive methods
22 Jan 2018
TL;DR: DRL-Cloud is presented, a novel Deep Reinforcement Learning (DRL)-based RP and TS system, to minimize energy cost for large-scale CSPs with very large number of servers that receive enormous numbers of user requests per day.
Abstract: Cloud computing has become an attractive computing paradigm in both academia and industry. Through virtualization technology, Cloud Service Providers (CSPs) that own data centers can structure physical servers into Virtual Machines (VMs) to provide services, resources, and infrastructures to users. Profit-driven CSPs charge users for service access and VM rental, and reduce power consumption and electric bills so as to increase profit margin. The key challenge faced by CSPs is data center energy cost minimization. Prior works proposed various algorithms to reduce energy cost through Resource Provisioning (RP) and/or Task Scheduling (TS). However, they have scalability issues or do not consider TS with task dependencies, which is a crucial factor that ensures correct parallel execution of tasks. This paper presents DRL-Cloud, a novel Deep Reinforcement Learning (DRL)-based RP and TS system, to minimize energy cost for large-scale CSPs with very large number of servers that receive enormous numbers of user requests per day. A deep Q-learning-based two-stage RP-TS processor is designed to automatically generate the best long-term decisions by learning from the changing environment such as user request patterns and realistic electric price. With training techniques such as target network, experience replay, and exploration and exploitation, the proposed DRL-Cloud achieves remarkably high energy cost efficiency, low reject rate as well as low runtime with fast convergence. Compared with one of the state-of-the-art energy efficient algorithms, the proposed DRL-Cloud achieves up to 320% energy cost efficiency improvement while maintaining lower reject rate on average. For an example CSP setup with 5,000 servers and 200,000 tasks, compared to a fast round-robin baseline, the proposed DRL-Cloud achieves up to 144% runtime reduction.
••24 Jul 2006
TL;DR: A new current-based cell delay model is utilized, which can accurately compute the output waveform for input waveforms of arbitrary shapes subjected to noise, and the cell parasitic capacitances are pre-characterized by lookup tables to improve the accuracy.
Abstract: A statistical model for the purpose of logic cell timing analysis in the presence of process variations is presented. A new current-based cell delay model is utilized, which can accurately compute the output waveform for input waveforms of arbitrary shapes subjected to noise. The cell parasitic capacitances are pre-characterized by lookup tables to improve the accuracy. To capture the effect of process parameter variations on the cell behavior, the output voltage waveform of logic cells is modeled by a stochastic Markovian process in which the voltage value probability distribution at each time instance is computed from that of the previous time instance. Next the probability distribution of a % V/sub dd/ crossing time, i.e., the hitting time of the output voltage stochastic process is computed. Experimental results demonstrate the high accuracy of our cell delay model compared to Monte-Carlo-based SPICE simulations.
TL;DR: A self-optimizing and self-programming computing system (SOSPCS) design framework that achieves both programmability and flexibility and exploits computing heterogeneity and concludes that SOSPCS provides performance improvement and energy reduction compared to the state-of-the-art approaches.
Abstract: There exists an urgent need for determining the right amount and type of specialization while making a heterogeneous system as programmable and flexible as possible. Therefore, in this paper, we pioneer a self-optimizing and self-programming computing system (SOSPCS) design framework that achieves both programmability and flexibility and exploits computing heterogeneity [e.g., CPUs, GPUs, and hardware accelerators (HWAs)]. First, at compile time, we form a task pool consisting of hybrid tasks with different processing element (PE) affinities according to target applications. Tasks preferred to be executed on GPUs or accelerators are detected from target applications by neural networks. Tasks suitable to run on CPUs are formed by community detection to minimize data movement overhead. Next, a distributed reinforcement learning-based approach is used at runtime to allow agents to map the tasks onto the network-on-chip-based heterogeneous PEs by learning an optimal policy based on $Q$ values in the environment. We have conducted experiments on a heterogeneous platform consisting of CPUs, GPUs, and HWAs with deep learning algorithms such as matrix multiplication, ReLU, and sigmoid functions. We concluded that SOSPCS provides performance improvement up to $4.12\times $ and energy reduction up to $3.24\times $ compared to the state-of-the-art approaches.
11 Aug 2014
TL;DR: Experimental results demonstrate 40% energy saving can be achieved by the proposed TEI-aware DTM approach compared to the best-in-class DTMs that are unaware of this phenomenon.
Abstract: Due to limits on the availability of the energy source in many mobile user platforms (ranging from handheld devices to portable electronics to deeply embedded devices) and concerns about how much heat can effectively be removed from chips, minimizing the power consumption has become a primary driver for system-on-chip designers. Because of their superb characteristics, FinFETs have emerged as a promising replacement for planar CMOS devices in sub-20nm CMOS technology nodes. However, based on extensive simulations, we have observed that the delay vs. temperature characteristics of FinFET-based circuits are fundamentally different from that of the conventional bulk CMOS circuits, i.e., the delay of a FinFET circuit decreases with increasing temperature even in the super-threshold supply voltage regime. Unfortunately, the leakage power dissipation of the FinFET-based circuits increases exponentially with the temperature. These two trends give rise to a tradeoff between delay and leakage power as a function of the chip temperature, and hence, lead to the definition of an optimum chip temperature operating point (i.e., one that balances concerns about the circuit speed and power efficiency.) This paper presents the results of our investigations into the aforesaid temperature effect inversion (TEI) and proposes a novel dynamic thermal management (DTM) algorithm, which exploits this phenomenon to minimize the energy consumption of FinFET-based circuits without any appreciable performance penalty. Experimental results demonstrate 40% energy saving (with no performance penalty) can be achieved by the proposed TEI-aware DTM approach compared to the best-in-class DTMs that are unaware of this phenomenon.
TL;DR: There is, I think, something ethereal about i —the square root of minus one, which seems an odd beast at that time—an intruder hovering on the edge of reality.
Abstract: There is, I think, something ethereal about i —the square root of minus one. I remember first hearing about it at school. It seemed an odd beast at that time—an intruder hovering on the edge of reality. Usually familiarity dulls this sense of the bizarre, but in the case of i it was the reverse: over the years the sense of its surreal nature intensified. It seemed that it was impossible to write mathematics that described the real world in …
University of Illinois at Urbana–Champaign1, Massachusetts Institute of Technology2, Harvard University3, Stanford University4, Rensselaer Polytechnic Institute5, Pennsylvania State University6, University of California, Berkeley7, Brown University8, University of Florida9, University of Texas at Austin10
14 Jan 2014-Applied physics reviews
TL;DR: In this article, a review of thermal transport at the nanoscale is presented, emphasizing developments in experiment, theory, and computation in the past ten years and summarizes the present status of the field.
Abstract: A diverse spectrum of technology drivers such as improved thermal barriers, higher efficiency thermoelectric energy conversion, phase-change memory, heat-assisted magnetic recording, thermal management of nanoscale electronics, and nanoparticles for thermal medical therapies are motivating studies of the applied physics of thermal transport at the nanoscale. This review emphasizes developments in experiment, theory, and computation in the past ten years and summarizes the present status of the field. Interfaces become increasingly important on small length scales. Research during the past decade has extended studies of interfaces between simple metals and inorganic crystals to interfaces with molecular materials and liquids with systematic control of interface chemistry and physics. At separations on the order of ∼1 nm, the science of radiative transport through nanoscale gaps overlaps with thermal conduction by the coupling of electronic and vibrational excitations across weakly bonded or rough interface...
01 Jan 2022
TL;DR: This paper provides a comprehensive review of various DR schemes and programs, based on the motivations offered to the consumers to participate in the program, and presents various optimization models for the optimal control of the DR strategies that have been proposed so far.
Abstract: The smart grid concept continues to evolve and various methods have been developed to enhance the energy efficiency of the electricity infrastructure. Demand Response (DR) is considered as the most cost-effective and reliable solution for the smoothing of the demand curve, when the system is under stress. DR refers to a procedure that is applied to motivate changes in the customers' power consumption habits, in response to incentives regarding the electricity prices. In this paper, we provide a comprehensive review of various DR schemes and programs, based on the motivations offered to the consumers to participate in the program. We classify the proposed DR schemes according to their control mechanism, to the motivations offered to reduce the power consumption and to the DR decision variable. We also present various optimization models for the optimal control of the DR strategies that have been proposed so far. These models are also categorized, based on the target of the optimization procedure. The key aspects that should be considered in the optimization problem are the system's constraints and the computational complexity of the applied optimization algorithm.