Author

David Irwin

Other affiliations: Duke University
Bio: David Irwin is an academic researcher at the University of Massachusetts Amherst. His research focuses on cloud computing and smart grids. He has an h-index of 36 and has co-authored 149 publications receiving 5,750 citations. His previous affiliations include Duke University.


Papers
Proceedings ArticleDOI
20 Jun 2011
TL;DR: This work formally defines a knee for continuous functions using the mathematical concept of curvature, compares that definition against alternatives, and evaluates Kneedle's accuracy against existing algorithms on both synthetic and real data sets, along with its performance in two different applications.
Abstract: Computer systems often reach a point at which the relative cost to increase some tunable parameter is no longer worth the corresponding performance benefit. These "knees" typically represent beneficial points that system designers have long selected to best balance inherent trade-offs. While prior work largely uses ad hoc, system-specific approaches to detect knees, we present Kneedle, a general approach to online and offline knee detection that is applicable to a wide range of systems. We define a knee formally for continuous functions using the mathematical concept of curvature and compare our definition against alternatives. We then evaluate Kneedle's accuracy against existing algorithms on both synthetic and real data sets, and evaluate its performance in two different applications.
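The paper's knee definition builds on the standard curvature of a continuous function, kappa(x) = f''(x) / (1 + f'(x)^2)^(3/2). The sketch below is a minimal illustration of that criterion on evenly sampled data; it is not the full Kneedle algorithm, which adds normalization and a difference-curve heuristic.

```python
# Minimal sketch of curvature-based knee detection on sampled data.
# kappa(x) = f''(x) / (1 + f'(x)^2)^(3/2); the knee is taken to be the
# point of maximum curvature. This illustrates the definition only, not
# the full Kneedle algorithm (normalization, difference curve, etc.).
import numpy as np

def max_curvature_knee(x, y):
    """Return the x-position of maximum curvature on (x, y) samples."""
    dy = np.gradient(y, x)    # finite-difference estimate of f'(x)
    d2y = np.gradient(dy, x)  # estimate of f''(x)
    kappa = np.abs(d2y) / (1.0 + dy**2) ** 1.5
    return x[np.argmax(kappa)]

# Saturating curve y = 1 - e^(-x): analytically, curvature peaks at
# x = ln(2)/2 ~= 0.35, and the numeric estimate should land nearby.
x = np.linspace(0.0, 5.0, 500)
y = 1.0 - np.exp(-x)
print("knee near x =", max_curvature_knee(x, y))
```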

689 citations

Proceedings ArticleDOI
02 Nov 2010
TL;DR: It is shown that even without a priori knowledge of household activities or prior training, it is possible to extract complex usage patterns from smart meter data using off-the-shelf statistical methods.
Abstract: Household smart meters that measure power consumption in real-time at fine granularities are the foundation of a future smart electricity grid. However, the widespread deployment of smart meters has serious privacy implications since they inadvertently leak detailed information about household activities. In this paper, we show that even without a priori knowledge of household activities or prior training, it is possible to extract complex usage patterns from smart meter data using off-the-shelf statistical methods. Our analysis uses two months of data from three homes, which we instrumented to log aggregate household power consumption every second. With the data from our small-scale deployment, we demonstrate the potential for power consumption patterns to reveal a range of information, such as how many people are in the home, sleeping routines, eating routines, etc. We then sketch out the design of a privacy-enhancing smart meter architecture that allows an electric utility to achieve its net metering goals without compromising the privacy of its customers.
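As an illustration of the kind of off-the-shelf analysis described above, the sketch below flags candidate appliance on/off events in a 1 Hz aggregate power trace by thresholding step changes. The 100 W threshold and the synthetic trace are illustrative assumptions, not the paper's actual pipeline or data.

```python
# Illustrative sketch (not the paper's exact methods): flag candidate
# appliance on/off events in a 1 Hz aggregate power trace by thresholding
# step changes. The 100 W threshold is a made-up example value.
import numpy as np

def detect_events(power_w, threshold_w=100.0):
    """Return indices where aggregate power jumps by more than threshold_w."""
    steps = np.diff(power_w)
    return np.flatnonzero(np.abs(steps) > threshold_w) + 1

# Toy trace: 300 W baseline, a 1.2 kW appliance on at t=50 and off at t=120.
trace = np.full(200, 300.0)
trace[50:120] += 1200.0
trace += np.random.normal(0, 10, size=trace.shape)  # measurement noise
print("events at t =", detect_events(trace))
```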

550 citations

Journal ArticleDOI
01 May 2006
TL;DR: This paper pursues power efficiencies at a larger scale by leveraging statistical properties of concurrent resource usage across a collection of systems (an "ensemble"), and discusses an implementation of this approach at the blade enclosure level that monitors and manages power across the individual blades in a chassis.
Abstract: One of the key challenges for high-density servers (e.g., blades) is the increased costs in addressing the power and heat density associated with compaction. Prior approaches have mainly focused on reducing the heat generated at the level of an individual server. In contrast, this work proposes power efficiencies at a larger scale by leveraging statistical properties of concurrent resource usage across a collection of systems ("ensemble"). Specifically, we discuss an implementation of this approach at the blade enclosure level to monitor and manage the power across the individual blades in a chassis. Our approach requires low-cost hardware modifications and relatively simple software support. We evaluate our architecture through both prototyping and simulation. For workloads representing 132 servers from nine different enterprise deployments, we show significant power budget reductions at performances comparable to conventional systems.
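The statistical effect the ensemble approach exploits is that individual servers rarely peak simultaneously, so the peak of the summed power trace sits well below the sum of the individual peaks. A minimal sketch, using synthetic traces rather than the paper's workload data:

```python
# Sketch of the statistical multiplexing behind ensemble-level power
# budgets: synthetic, uncorrelated per-blade traces (illustrative only).
import numpy as np

rng = np.random.default_rng(0)
n_servers, n_samples = 16, 10_000
# Each blade idles near 150 W with bursty, uncorrelated load spikes.
traces = 150 + rng.gamma(shape=2.0, scale=40.0, size=(n_servers, n_samples))

sum_of_peaks = traces.max(axis=1).sum()  # naive per-blade provisioning
peak_of_sum = traces.sum(axis=0).max()   # enclosure-level provisioning
print(f"sum of peaks:  {sum_of_peaks:,.0f} W")
print(f"peak of sum:   {peak_of_sum:,.0f} W")
print(f"budget saving: {1 - peak_of_sum / sum_of_peaks:.1%}")
```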

421 citations

Proceedings ArticleDOI
15 Dec 2011
TL;DR: This paper explores automatically creating site-specific prediction models for solar power generation from National Weather Service weather forecasts using machine learning techniques, and shows that SVM-based prediction models built using seven distinct weather forecast metrics are 27% more accurate for the authors' site than existing forecast-based models.
Abstract: A key goal of smart grid initiatives is significantly increasing the fraction of grid energy contributed by renewables. One challenge with integrating renewables into the grid is that their power generation is intermittent and uncontrollable. Thus, predicting future renewable generation is important, since the grid must dispatch generators to satisfy demand as generation varies. While manually developing sophisticated prediction models may be feasible for large-scale solar farms, developing them for distributed generation at millions of homes throughout the grid is a challenging problem. To address the problem, in this paper, we explore automatically creating site-specific prediction models for solar power generation from National Weather Service (NWS) weather forecasts using machine learning techniques. We compare multiple regression techniques for generating prediction models, including linear least squares and support vector machines using multiple kernel functions. We evaluate the accuracy of each model using historical NWS forecasts and solar intensity readings from a weather station deployment for nearly a year. Our results show that SVM-based prediction models built using seven distinct weather forecast metrics are 27% more accurate for our site than existing forecast-based models.
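A minimal sketch of the modeling approach, fitting an SVM regressor from forecast features to observed solar intensity with scikit-learn; the three feature names and the synthetic data are illustrative stand-ins, not the paper's seven NWS forecast metrics or its weather-station dataset:

```python
# Sketch: SVM regression from NWS-style forecast features to solar
# intensity. Features and data below are invented for illustration.
import numpy as np
from sklearn.svm import SVR
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(1)
n = 500
# Hypothetical forecast features: [sky_cover %, humidity %, temperature C].
X = np.column_stack([
    rng.uniform(0, 100, n),
    rng.uniform(10, 100, n),
    rng.uniform(-5, 35, n),
])
# Toy ground truth: solar intensity falls with sky cover, plus noise.
y = 900 * (1 - X[:, 0] / 130) + rng.normal(0, 40, n)

model = make_pipeline(StandardScaler(), SVR(kernel="rbf", C=100.0))
model.fit(X[:400], y[:400])
print("held-out R^2:", model.score(X[400:], y[400:]))
```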

410 citations

Proceedings ArticleDOI
Jeffrey S. Chase, David Irwin, Laura Grit, Justin D. Moore, Sara Sprenkle
22 Jun 2003
TL;DR: New mechanisms for dynamic resource management are presented in a cluster manager called Cluster-on-Demand (COD), supporting dynamic, policy-based cluster sharing between local users and hosted Grid services, resource reservation and adaptive provisioning, scavenging of idle resources, and dynamic instantiation of Grid services.
Abstract: This paper presents new mechanisms for dynamic resource management in a cluster manager called Cluster-on-Demand (COD). COD allocates servers from a common pool to multiple virtual clusters (vclusters), with independently configured software environments, name spaces, user access controls, and network storage volumes. We present experiments using the popular Sun GridEngine batch scheduler to demonstrate that dynamic virtual clusters are an enabling abstraction for advanced resource management in computing utilities and grids. In particular, they support dynamic, policy-based cluster sharing between local users and hosted Grid services, resource reservation and adaptive provisioning, scavenging of idle resources, and dynamic instantiation of Grid services. These goals are achieved in a direct and general way through a new set of fundamental cluster management functions, with minimal impact on the Grid middleware itself.
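A toy model of the core COD idea, a shared server pool dynamically partitioned into virtual clusters, may help fix intuition; the class and methods below are illustrative inventions, not COD's actual interfaces:

```python
# Hypothetical toy model of dynamic virtual clusters over a shared pool.
# Names and policy are invented for illustration only.
class ClusterPool:
    def __init__(self, servers):
        self.free = set(servers)
        self.vclusters = {}  # vcluster name -> set of assigned servers

    def grow(self, vcluster, count):
        """Move up to `count` idle servers into a virtual cluster."""
        grant = {self.free.pop() for _ in range(min(count, len(self.free)))}
        self.vclusters.setdefault(vcluster, set()).update(grant)
        return grant

    def shrink(self, vcluster, count):
        """Scavenge `count` servers back into the shared pool."""
        assigned = self.vclusters.get(vcluster, set())
        victims = {assigned.pop() for _ in range(min(count, len(assigned)))}
        self.free.update(victims)
        return victims

pool = ClusterPool([f"node{i:02d}" for i in range(8)])
pool.grow("batch", 5)          # local batch users get 5 nodes
pool.shrink("batch", 2)        # idle nodes scavenged...
pool.grow("grid-service", 2)   # ...and re-provisioned to a hosted Grid service
print({k: sorted(v) for k, v in pool.vclusters.items()}, sorted(pool.free))
```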

407 citations


Cited by
Journal ArticleDOI
TL;DR: This paper defines Cloud computing, provides an architecture for creating market-oriented Clouds by leveraging technologies such as Virtual Machines (VMs), and offers insights on market-based resource management strategies that encompass both customer-driven service management and computational risk management to sustain Service Level Agreement (SLA)-oriented resource allocation.

5,850 citations

Proceedings ArticleDOI
09 Jun 2007
TL;DR: This paper presents the aggregate power usage characteristics of large collections of servers for different classes of applications over a period of approximately six months, and uses the modelling framework to estimate the potential of power management schemes to reduce peak power and energy usage.
Abstract: Large-scale Internet services require a computing infrastructure that can be appropriately described as a warehouse-sized computing system. The cost of building datacenter facilities capable of delivering a given power capacity to such a computer can rival the recurring energy consumption costs themselves. Therefore, there are strong economic incentives to operate facilities as close as possible to maximum capacity, so that the non-recurring facility costs can be best amortized. That is difficult to achieve in practice because of uncertainties in equipment power ratings and because power consumption tends to vary significantly with the actual computing activity. Effective power provisioning strategies are needed to determine how much computing equipment can be safely and efficiently hosted within a given power budget. In this paper we present the aggregate power usage characteristics of large collections of servers (up to 15 thousand) for different classes of applications over a period of approximately six months. Those observations allow us to evaluate opportunities for maximizing the use of the deployed power capacity of datacenters, and assess the risks of over-subscribing it. We find that even in well-tuned applications there is a noticeable gap (7-16%) between achieved and theoretical aggregate peak power usage at the cluster level (thousands of servers). The gap grows to almost 40% in whole datacenters. This headroom can be used to deploy additional compute equipment within the same power budget with minimal risk of exceeding it. We use our modeling framework to estimate the potential of power management schemes to reduce peak power and energy usage. We find that the opportunities for power and energy savings are significant, but greater at the cluster-level (thousands of servers) than at the rack-level (tens). Finally we argue that systems need to be power efficient across the activity range, and not only at peak performance levels.
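The provisioning arithmetic behind oversubscription is simple: if the observed aggregate peak is only a fraction of the theoretical (nameplate) peak, the headroom admits more machines under the same budget. For example, a 40% gap allows roughly 1/0.6 ~= 1.67x the equipment. A sketch with illustrative numbers:

```python
# Sketch of the oversubscription arithmetic; all numbers are
# illustrative, not measurements from the paper.
theoretical_peak_per_server_w = 300.0
n_servers = 5_000
observed_aggregate_peak_w = 0.60 * n_servers * theoretical_peak_per_server_w
# ^ a ~40% gap between achieved and theoretical aggregate peak,
#   in line with what the paper reports at datacenter scale

budget_w = n_servers * theoretical_peak_per_server_w
per_server_observed_w = observed_aggregate_peak_w / n_servers
extra_servers = int(budget_w / per_server_observed_w) - n_servers
print(f"additional servers under the same budget: {extra_servers}")  # ~3333
```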

2,047 citations

Book ChapterDOI
04 Oct 2019
TL;DR: A computational complexity theory of the knowledge contained in a proof is developed, zero-knowledge proofs are defined, and zero-knowledge proof systems are given for quadratic residuosity and quadratic nonresiduosity, the first examples for languages not known to be efficiently recognizable.
Abstract: Usually, a proof of a theorem contains more knowledge than the mere fact that the theorem is true. For instance, to prove that a graph is Hamiltonian it suffices to exhibit a Hamiltonian tour in it; however, this seems to contain more knowledge than the single bit Hamiltonian/non-Hamiltonian. In this paper a computational complexity theory of the "knowledge" contained in a proof is developed. Zero-knowledge proofs are defined as those proofs that convey no additional knowledge other than the correctness of the proposition in question. Examples of zero-knowledge proof systems are given for the languages of quadratic residuosity and quadratic nonresiduosity. These are the first examples of zero-knowledge proofs for languages not known to be efficiently recognizable.
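The flavor of these proof systems can be seen in the classic interactive protocol for quadratic residuosity, sketched below as a toy, non-cryptographic simulation; the tiny modulus is purely illustrative, and a real instantiation needs a large modulus of unknown factorization:

```python
# Toy sketch of the zero-knowledge proof of quadratic residuosity: the
# prover convinces the verifier that y is a square mod n without
# revealing a square root x. Parameters are tiny and illustrative.
import random

n = 3233                      # toy modulus (61 * 53), NOT secure
x = 1234                      # prover's secret square root
y = pow(x, 2, n)              # public claim: y is a quadratic residue mod n

def round_of_proof():
    r = random.randrange(1, n)
    s = pow(r, 2, n)            # prover commits to a random square
    b = random.randrange(2)     # verifier's challenge bit
    z = (r * pow(x, b, n)) % n  # prover reveals r or r*x
    # Verifier checks z^2 == s * y^b (mod n); each round reveals nothing
    # about x, since z is a uniformly random value in either case.
    return pow(z, 2, n) == (s * pow(y, b, n)) % n

# Repetition drives a cheating prover's success probability to 2^-k.
assert all(round_of_proof() for _ in range(20))
print("verifier accepts after 20 rounds")
```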

1,962 citations

Proceedings ArticleDOI
18 May 2009
TL;DR: This work presents Eucalyptus, an open-source software framework for cloud computing that implements what is commonly referred to as Infrastructure as a Service (IaaS): systems that give users the ability to run and control entire virtual machine instances deployed across a variety of physical resources.
Abstract: Cloud computing systems fundamentally provide access to large pools of data and computational resources through a variety of interfaces similar in spirit to existing grid and HPC resource management and programming systems. These types of systems offer a new programming target for scalable application developers and have gained popularity over the past few years. However, most cloud computing systems in operation today are proprietary, rely upon infrastructure that is invisible to the research community, or are not explicitly designed to be instrumented and modified by systems researchers. In this work, we present Eucalyptus, an open-source software framework for cloud computing that implements what is commonly referred to as Infrastructure as a Service (IaaS): systems that give users the ability to run and control entire virtual machine instances deployed across a variety of physical resources. We outline the basic principles of the Eucalyptus design, detail important operational aspects of the system, and discuss architectural trade-offs that we have made in order to allow Eucalyptus to be portable, modular, and simple to use on infrastructure commonly found within academic settings. Finally, we provide evidence that Eucalyptus enables users familiar with existing Grid and HPC systems to explore new cloud computing functionality while maintaining access to existing, familiar application development software and Grid middleware.
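For intuition, the sketch below models the IaaS abstraction the paper describes, running, inspecting, and terminating whole VM instances through a controller; the class and method names are hypothetical inventions for illustration, not Eucalyptus's actual (EC2-compatible) API:

```python
# Hypothetical toy model of the IaaS instance lifecycle. Everything
# here is invented for illustration; it is not Eucalyptus's API.
import itertools

class ToyIaaS:
    _ids = itertools.count(1)

    def __init__(self):
        self.instances = {}

    def run_instance(self, image, vm_type):
        iid = f"i-{next(self._ids):06d}"
        self.instances[iid] = {"image": image, "type": vm_type, "state": "running"}
        return iid

    def describe_instances(self):
        return dict(self.instances)

    def terminate_instance(self, iid):
        self.instances[iid]["state"] = "terminated"

cloud = ToyIaaS()
iid = cloud.run_instance(image="demo-image", vm_type="small")
print(cloud.describe_instances())
cloud.terminate_instance(iid)
```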

1,962 citations