scispace - formally typeset
Search or ask a question
Author

Ayse K. Coskun

Bio: Ayse K. Coskun is an academic researcher from Boston University. The author has contributed to research in topics: Efficient energy use & Energy consumption. The author has an hindex of 30, co-authored 152 publications receiving 3443 citations. Previous affiliations of Ayse K. Coskun include Complutense University of Madrid & Sun Microsystems.


Papers
More filters
Proceedings ArticleDOI
03 Dec 2011
TL;DR: Pack & Cap is proposed, a control technique designed to make optimal DVFS and thread packing control decisions in order to maximize performance within a power budget and is implemented and validated on a real quad-core system running the PARSEC parallel benchmark suite.
Abstract: The ability to cap peak power consumption is a desirable feature in modern data centers for energy budgeting, cost management, and efficient power delivery. Dynamic voltage and frequency scaling (DVFS) is a traditional control knob in the tradeoff between server power and performance. Multi-core processors and the parallel applications that take advantage of them introduce new possibilities for control, wherein workload threads are packed onto a variable number of cores and idle cores enter low-power sleep states. This paper proposes Pack & Cap, a control technique designed to make optimal DVFS and thread packing control decisions in order to maximize performance within a power budget. In order to capture the workload dependence of the performance-power Pareto frontier, a multinomial logistic regression (MLR) classifier is built using a large volume of performance counter, temperature, and power characterization data. When queried during runtime, the classifier is capable of accurately selecting the optimal operating point. We implement and validate this method on a real quad-core system running the PARSEC parallel benchmark suite. When varying the power budget during runtime, Pack & Cap meets power constraints 82% of the time even in the absence of a power measuring device. The addition of thread packing to DVFS as a control knob increases the range of feasible power constraints by an average of 21% when compared to DVFS alone and reduces workload energy consumption by an average of 51.6% compared to existing control techniques that achieve the same power range.

262 citations

Proceedings ArticleDOI
16 Apr 2007
TL;DR: This work design and evaluate OS-level dynamic scheduling policies with negligible performance overhead, and shows that, using simple to implement policies that make decisions based on temperature measurements, better temporal and spatial thermal profiles can be achieved in comparison to state-of-art schedulers.
Abstract: In deep submicron circuits, elevation in temperatures has brought new challenges in reliability, timing, performance, cooling costs and leakage power. Conventional thermal management techniques sacrifice performance to control the thermal behavior by slowing down or turning off the processors when a critical temperature threshold is exceeded. Moreover, studies have shown that in addition to high temperatures, temporal and spatial variations in temperature impact system reliability. In this work, we explore the benefits of thermally aware task scheduling for multiprocessor systems-on-a-chip (MPSoC). We design and evaluate OS-level dynamic scheduling policies with negligible performance overhead. We show that, using simple to implement policies that make decisions based on temperature measurements, better temporal and spatial thermal profiles can be achieved in comparison to state-of-art schedulers. We also enhance reactive strategies such as dynamic thread migration with our scheduling policies. This way, hot spots and temperature variations are decreased, and the performance cost is significantly reduced.

240 citations

Proceedings ArticleDOI
20 Apr 2009
TL;DR: This work first investigates how the existing thermal management, power management and job scheduling policies affect the thermal behavior in 3D chips, and proposes a dynamic thermally-aware job scheduling technique for 3D systems to reduce the thermal problems at very low performance cost.
Abstract: Technology scaling has caused the feature sizes to shrink continuously, whereas interconnects, unlike transistors, have not followed the same trend. Designing 3D stack architectures is a recently proposed approach to overcome the power consumption and delay problems associated with the interconnects by reducing the length of the wires going across the chip. However, 3D integration introduces serious thermal challenges due to the high power density resulting from placing computational units on top of each other. In this work, we first investigate how the existing thermal management, power management and job scheduling policies affect the thermal behavior in 3D chips. We then propose a dynamic thermally-aware job scheduling technique for 3D systems to reduce the thermal problems at very low performance cost. Our approach can also be integrated with power management policies to reduce energy consumption while avoiding the thermal hot spots and large temperature variations.

191 citations

Journal ArticleDOI
TL;DR: In this paper, the authors explore the benefits of temperature-aware task scheduling for multiprocessor system-on-a-chip (MPSoC) and evaluate their techniques using workload characteristics collected from a real system by Sun's Continuous System Telemetry.
Abstract: Thermal hot spots and high temperature gradients degrade reliability and performance, and increase cooling costs and leakage power. In this paper, we explore the benefits of temperature-aware task scheduling for multiprocessor system-on-a-chip (MPSoC). We evaluate our techniques using workload characteristics collected from a real system by Sun's Continuous System Telemetry. We first solve the task scheduling problem statically using integer linear programming (ILP). The ILP solution is guaranteed to be optimal for the given assumptions for tasks. We formulate ILPs for minimizing energy, balancing energy, and reducing hot spots, and provide an extensive comparison of their thermal behavior against our technique. Our static solution can reduce the frequency of hot spots by 35%, spatial gradients by 85%, and thermal cycles by 61% in comparison to the ILP for minimizing energy. We then design dynamic scheduling policies at the OS-level with negligible performance overhead. Our adaptive dynamic policy reduces the frequency of high-magnitude thermal cycles and spatial gradients by around 50% and 90%, respectively, in comparison to state-of-the-art schedulers. Reactive thermal management strategies, such as thread migration, can be combined with our scheduling policy to further reduce hot spots, temperature variations, and the associated performance cost.

157 citations

Journal ArticleDOI
TL;DR: This paper investigates how to use predictors for forecasting temperature and workload dynamics, and proposes proactive thermal management techniques for multiprocessor system-on-chips.
Abstract: Conventional thermal management techniques are reactive, as they take action after temperature reaches a threshold. Such approaches do not always minimize and balance the temperature, and they control temperature at a noticeable performance cost. This paper investigates how to use predictors for forecasting temperature and workload dynamics, and proposes proactive thermal management techniques for multiprocessor system-on-chips. The predictors we study include autoregressive moving average modeling and lookup tables. We evaluate several reactive and predictive techniques on an UltraSPARC T1 processor and an architecture-level simulator. Proactive methods achieve significantly better thermal profiles and performance in comparison to reactive policies.

140 citations


Cited by
More filters
01 May 1993
TL;DR: Comparing the results to the fastest reported vectorized Cray Y-MP and C90 algorithm shows that the current generation of parallel machines is competitive with conventional vector supercomputers even for small problems.
Abstract: Three parallel algorithms for classical molecular dynamics are presented. The first assigns each processor a fixed subset of atoms; the second assigns each a fixed subset of inter-atomic forces to compute; the third assigns each a fixed spatial region. The algorithms are suitable for molecular dynamics models which can be difficult to parallelize efficiently—those with short-range forces where the neighbors of each atom change rapidly. They can be implemented on any distributed-memory parallel machine which allows for message-passing of data between independently executing processors. The algorithms are tested on a standard Lennard-Jones benchmark problem for system sizes ranging from 500 to 100,000,000 atoms on several parallel supercomputers--the nCUBE 2, Intel iPSC/860 and Paragon, and Cray T3D. Comparing the results to the fastest reported vectorized Cray Y-MP and C90 algorithm shows that the current generation of parallel machines is competitive with conventional vector supercomputers even for small problems. For large problems, the spatial algorithm achieves parallel efficiencies of 90% and a 1840-node Intel Paragon performs up to 165 faster than a single Cray C9O processor. Trade-offs between the three algorithms and guidelines for adapting them to more complex molecular dynamics simulations are also discussed.

29,323 citations

01 Jan 2015

976 citations

Journal ArticleDOI
TL;DR: An in-depth study of the existing literature on data center power modeling, covering more than 200 models, organized in a hierarchical structure with two main branches focusing on hardware-centric and software-centric power models.
Abstract: Data centers are critical, energy-hungry infrastructures that run large-scale Internet-based services. Energy consumption models are pivotal in designing and optimizing energy-efficient operations to curb excessive energy consumption in data centers. In this paper, we survey the state-of-the-art techniques used for energy consumption modeling and prediction for data centers and their components. We conduct an in-depth study of the existing literature on data center power modeling, covering more than 200 models. We organize these models in a hierarchical structure with two main branches focusing on hardware-centric and software-centric power models. Under hardware-centric approaches we start from the digital circuit level and move on to describe higher-level energy consumption models at the hardware component level, server level, data center level, and finally systems of systems level. Under the software-centric approaches we investigate power models developed for operating systems, virtual machines and software applications. This systematic approach allows us to identify multiple issues prevalent in power modeling of different levels of data center systems, including: i) few modeling efforts targeted at power consumption of the entire data center ii) many state-of-the-art power models are based on a few CPU or server metrics, and iii) the effectiveness and accuracy of these power models remain open questions. Based on these observations, we conclude the survey by describing key challenges for future research on constructing effective and accurate data center power models.

741 citations

Journal ArticleDOI
TL;DR: This paper provides a general description of NoC architectures and applications and enumerates several related research problems organized under five main categories: Application characterization, communication paradigm, communication infrastructure, analysis, and solution evaluation.
Abstract: To alleviate the complex communication problems that arise as the number of on-chip components increases, network-on-chip (NoC) architectures have been recently proposed to replace global interconnects. In this paper, we first provide a general description of NoC architectures and applications. Then, we enumerate several related research problems organized under five main categories: Application characterization, communication paradigm, communication infrastructure, analysis, and solution evaluation. Motivation, problem description, proposed approaches, and open issues are discussed for each problem from system, microarchitecture, and circuit perspectives. Finally, we address the interactions among these research problems and put the NoC design process into perspective.

733 citations