
Showing papers by "David Irwin published in 2016"


Proceedings ArticleDOI
18 Apr 2016
TL;DR: Flint is designed, which is based on Spark and includes automated checkpointing and server selection policies that support batch and interactive applications and dynamically adapt to application characteristics, and yields cost savings of up to 90% compared to using on-demand servers.
Abstract: Cloud providers now offer transient servers, which they may revoke at any time, for significantly lower prices than on-demand servers, which they cannot revoke. The low price of transient servers is particularly attractive for executing an emerging class of workload, which we call Batch-Interactive Data-Intensive (BIDI), that is becoming increasingly important for data analytics. BIDI workloads require large sets of servers to cache massive datasets in memory to enable low latency operation. In this paper, we illustrate the challenges of executing BIDI workloads on transient servers, where revocations (akin to failures) are the common case. To address these challenges, we design Flint, which is based on Spark and includes automated checkpointing and server selection policies that i) support batch and interactive applications and ii) dynamically adapt to application characteristics. We evaluate a prototype of Flint using EC2 spot instances, and show that it yields cost savings of up to 90% compared to using on-demand servers, while increasing running time by
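The revocation-aware checkpointing problem above can be illustrated with the classic Young approximation for checkpoint intervals, which treats revocations like failures. This is a simplified sketch, not Flint's actual adaptive policy, and the numbers are illustrative:

```python
import math

def optimal_checkpoint_interval(checkpoint_cost_s: float, mttr_s: float) -> float:
    """Young's first-order approximation of the optimal interval between
    checkpoints, given the cost of writing one checkpoint and the mean
    time to revocation (treated like a mean time to failure)."""
    return math.sqrt(2 * checkpoint_cost_s * mttr_s)

# e.g., a 30s checkpoint on spot servers revoked every 2 hours on average
interval = optimal_checkpoint_interval(30, 2 * 3600)
```

A system like Flint must additionally adapt this interval per market, since different spot markets have very different revocation rates.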

91 citations


Proceedings Article
20 Jun 2016
TL;DR: It is argued that sophisticated bidding strategies, in practice, do not provide any advantages over simple strategies for multiple reasons.
Abstract: Cloud providers have begun to allow users to bid for surplus servers on a spot market. These servers are allocated if a user's bid price is higher than their market price and revoked otherwise. Thus, analyzing price data to derive optimal bidding strategies has become a popular research topic. In this paper, we argue that sophisticated bidding strategies, in practice, do not provide any advantages over simple strategies for multiple reasons. First, due to price characteristics, there are a wide range of bid prices that yield the optimal cost and availability. Second, given the large number of spot markets, there is always a market with available surplus resources. Thus, if resources become unavailable due to a price spike, users need not wait until the spike subsides, but can instead provision a new spot resource elsewhere and migrate to it. Third, current spot market rules enable users to place maximum bids for resources without any penalty. Given bidding's irrelevance, users can adopt trivial bidding strategies and focus instead on modifying applications to efficiently seek out and migrate to the lowest cost resources.
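The paper's argument reduces bidding to a trivial policy: bid the on-demand price as an effective maximum, run in the cheapest spot market, and migrate when prices spike. A minimal sketch of that selection step, with hypothetical market names and prices:

```python
def pick_market(spot_prices: dict, on_demand_price: float) -> str:
    """Trivial strategy in the spirit of the paper: treat the on-demand
    price as the maximum bid and simply run in the cheapest spot market;
    if that market's price spikes, re-run this selection and migrate."""
    cheapest = min(spot_prices, key=spot_prices.get)
    # only worth using spot if it actually undercuts on-demand
    return cheapest if spot_prices[cheapest] < on_demand_price else "on-demand"

markets = {"us-east-1a": 0.12, "us-east-1b": 0.04, "us-west-2a": 0.07}
choice = pick_market(markets, on_demand_price=0.10)   # -> "us-east-1b"
```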

45 citations


Proceedings ArticleDOI
16 Nov 2016
TL;DR: SunSpot is able to localize a solar-powered home to a small region of interest that is near the smallest possible area given the energy data resolution, e.g., within a ~500m and ~28km radius for per-second and per-minute resolution, respectively.
Abstract: Homeowners are increasingly deploying grid-tied solar systems due to the rapid decline in solar module prices. The energy produced by these solar-powered homes is monitored by utilities and third parties using networked energy meters, which record and transmit energy data at fine-grained intervals. Such energy data is considered anonymous if it is not associated with identifying account information, e.g., a name and address. Thus, energy data from these "anonymous" homes is often not handled securely: it is routinely transmitted over the Internet in plaintext, stored unencrypted in the cloud, shared with third-party energy analytics companies, and even made publicly available over the Internet. Extensive prior work has shown that energy consumption data is vulnerable to multiple attacks, which analyze it to reveal a range of sensitive private information about occupant activities. However, these attacks are useless without knowledge of a home's location. Our key insight is that solar energy data is not anonymous: since every location on Earth has a unique solar signature, it embeds detailed location information. To explore the severity and extent of this privacy threat, we design SunSpot to localize "anonymous" solar-powered homes using their solar energy data. We evaluate SunSpot on publicly-available energy data from 14 homes with rooftop solar. We find that SunSpot is able to localize a solar-powered home to a small region of interest that is near the smallest possible area given the energy data resolution, e.g., within a ~500m and ~28km radius for per-second and per-minute resolution, respectively. SunSpot then identifies solar-powered homes within this region using crowd-sourced image processing of satellite data before applying additional filters to identify a specific home.
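A small taste of why solar data embeds location: the UTC time at which generation peaks (solar noon) directly constrains longitude. The sketch below ignores the equation-of-time correction that a real system like SunSpot would have to model, and the input time is illustrative:

```python
def estimate_longitude(solar_noon_utc_hours: float) -> float:
    """Rough longitude estimate (degrees, east positive) from the UTC time
    at which solar output peaks. Earth rotates 15 degrees per hour, so each
    hour of offset from 12:00 UTC shifts the estimate by 15 degrees.
    Ignores the equation-of-time correction (up to ~16 minutes)."""
    return (12.0 - solar_noon_utc_hours) * 15.0

# a trace peaking at 16:50 UTC suggests a home near 72.5 degrees west
lon = estimate_longitude(16 + 50 / 60)
```

Latitude can be constrained similarly from day length and seasonal variation, which is why per-second data pins a home down so tightly.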

37 citations


Proceedings ArticleDOI
27 Jun 2016
TL;DR: This work incorporates market-based probing into SpotLight, an information service that enables cloud applications to query this and other data, and uses it to monitor the availability of more than 4500 distinct server types across 9 geographical regions in Amazon's Elastic Compute Cloud over a 3 month period.
Abstract: Infrastructure-as-a-Service cloud platforms are incredibly complex: they rent hundreds of different types of servers across multiple geographical regions under a wide range of contract types that offer varying tradeoffs between risk and cost. Unfortunately, the internal dynamics of cloud platforms are opaque along several dimensions. For example, while the risk of servers not being available when requested is critical in optimizing the cloud's risk-cost tradeoffs, it is not typically made visible to users. Thus, inspired by prior work on Internet bandwidth probing, we propose actively probing cloud platforms to explicitly learn such information, where each "probe" is a request for a particular type of server. We model the relationships between different contract types to develop a market-based probing policy, which leverages the insight that real-time prices in cloud spot markets loosely correlate with the supply (and availability) of fixed-price on-demand servers. That is, the higher the spot price for a server, the more likely the corresponding fixed-price on-demand server is not available. We incorporate market-based probing into SpotLight, an information service that enables cloud applications to query this and other data, and use it to monitor the availability of more than 4500 distinct server types across 9 geographical regions in Amazon's Elastic Compute Cloud over a 3 month period. We analyze this data to reveal interesting observations about the platform's internal dynamics. We then show how SpotLight enables two recently proposed derivative cloud services to select a better mix of servers to host applications, which improves their availability from ~70-90% to near 100% in practice.
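The market-based insight above lends itself to a simple prioritization heuristic: probe first where the spot price is closest to (or above) the on-demand price, since those markets are the most likely to have unavailable on-demand servers. This is an illustrative sketch with made-up prices, not SpotLight's actual policy:

```python
def probe_priority(spot_price: float, on_demand_price: float) -> float:
    """Higher ratio of spot to on-demand price suggests tighter supply,
    making that market a more informative probe target."""
    return spot_price / on_demand_price

def rank_probes(markets: dict) -> list:
    """Order markets by descending probe priority.
    markets maps name -> (spot_price, on_demand_price)."""
    return sorted(markets, key=lambda m: probe_priority(*markets[m]), reverse=True)

ranked = rank_probes({"us-east": (0.9, 1.0), "us-west": (0.2, 1.0)})
```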

33 citations


Proceedings ArticleDOI
13 Nov 2016
TL;DR: This work presents policies for partitioning a variable amount of idle capacity into classes with different transient guarantees to maximize performance and value, and shows that this approach can increase the aggregate revenue from idle server capacity by up to ∼6.5× compared to existing approaches.
Abstract: To prevent rejecting requests, cloud platforms typically provision for their peak demand. Thus, a platform's idle capacity can be significant, as demand varies widely over multiple time scales, e.g., daily and seasonally. To reduce waste, platforms have begun to offer this idle capacity in the form of transient servers, which they may unilaterally revoke, for much lower prices---~50-90% less---than on-demand servers, which they cannot revoke. However, transient servers' revocation characteristics---their volatility and predictability---influence their performance, since they affect the overhead of fault-tolerance mechanisms applications use to handle revocations. Unfortunately, current cloud platforms offer no guarantees on revocation characteristics, which makes it difficult for users to optimally configure (and correctly value) transient servers. To address the problem, we propose the abstraction of a transient guarantee, which offers probabilistic assurances on revocation characteristics. Transient guarantees have numerous benefits: they increase the performance of transient servers, enable users to optimally use and correctly value them, and permit platforms to control their freedom to revoke them. We present policies for partitioning a variable amount of idle capacity into classes with different transient guarantees to maximize performance and value. We then implement and evaluate these policies on job traces from a production Google cluster. We show that our approach can increase the aggregate revenue from idle server capacity by up to ~6.5X compared to existing approaches.
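Partitioning idle capacity into guarantee classes can be sketched as a simple proportional split; the class names and fractions below are assumptions for the example, not the paper's revenue-maximizing policy:

```python
def partition_capacity(idle_servers: int, fractions: dict) -> dict:
    """Split current idle capacity into transient-guarantee classes
    according to fixed fractions. A real policy would instead choose the
    split to maximize revenue given demand and revocation forecasts."""
    alloc, remaining = {}, idle_servers
    items = list(fractions.items())
    for name, frac in items[:-1]:
        alloc[name] = int(idle_servers * frac)
        remaining -= alloc[name]
    # rounding leftover lands in the final (weakest-guarantee) class
    alloc[items[-1][0]] = remaining
    return alloc

classes = partition_capacity(100, {"strong": 0.2, "medium": 0.3, "weak": 0.5})
```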

33 citations


Proceedings ArticleDOI
01 Jan 2016
TL;DR: This paper designs a solar-powered EV charging station in a parking lot of a car-share service and formulates a Linear Programming approach to charge EVs that maximizes the utilization of solar energy while maintaining similar battery levels for all cars.
Abstract: Electric vehicles (EV) are growing in popularity as a credible alternative to gas-powered vehicles. These vehicles require their batteries to be "fueled up" for operation. While EV charging has traditionally been grid-based, the use of solar-powered chargers has emerged as an interesting opportunity. These chargers provide clean electricity to electric-powered cars that are themselves pollution free, resulting in positive environmental effects. In this paper, we design a solar-powered EV charging station in a parking lot of a car-share service. In such a car-share service, rental pick-up and drop-off times are known. We formulate a Linear Programming approach to charge EVs that maximizes the utilization of solar energy while maintaining similar battery levels for all cars. We evaluate the performance of our algorithm on real-world and synthetically derived datasets to show that it fairly distributes the available electric charge among candidate EVs across seasons with variable demand profiles. Further, we reduce the disparity in battery charge levels by 60% compared to a best-effort charging policy. Moreover, we show that the 80th percentile of EVs have at least a 75% battery level at the end of their charging session. Finally, we demonstrate the feasibility of our charging station and show that a solar installation proportional to the size of a parking lot adequately apportions available solar energy generated to the EVs serviced.
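The joint objective of using all solar energy while equalizing battery levels can be approximated greedily by water-filling: in each interval, direct the next quantum of solar energy to the EV with the lowest battery level. This is a sketch of the objective, not the paper's LP formulation, and the numbers are illustrative:

```python
def allocate_solar(solar_kwh: float, batteries: dict, capacity_kwh: float,
                   quantum: float = 0.1) -> dict:
    """Greedy water-filling: repeatedly give a small quantum of solar
    energy to the lowest-charged EV, which both uses the available solar
    energy and pulls battery levels toward equality."""
    levels = dict(batteries)
    for _ in range(int(round(solar_kwh / quantum))):
        low = min(levels, key=levels.get)
        if levels[low] + quantum > capacity_kwh:
            break  # every battery is effectively full
        levels[low] += quantum
    return levels

# 2 kWh of solar shared between two cars at 5 and 6 kWh (10 kWh packs)
result = allocate_solar(2.0, {"carA": 5.0, "carB": 6.0}, 10.0)
```

An LP, as in the paper, can additionally exploit known rental pick-up times to prioritize cars that leave soonest.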

32 citations


Proceedings ArticleDOI
16 Nov 2016
TL;DR: This paper conducts a wide-ranging analysis of the city's gas and electric data to gain insights into the energy consumption of both individual homes and the city as a whole and demonstrates how city-scale smart meter datasets can answer a variety of questions on building energy consumption.
Abstract: Understanding the energy usage of buildings is crucial for policy-making, energy planning, and achieving sustainable development. Unfortunately, instrumenting buildings to collect energy usage data is difficult, and all publicly available datasets typically include only a few hundred homes within a region. Due to their relatively small size, these datasets provide limited insight and are insufficient for analyses that require a larger representation, such as an entire city or town. In recent years, utility companies have installed advanced electric and gas meters, i.e., "smart meters", that enable energy data collection on a massive scale. In this paper, we analyze such a dataset from a utility company that includes energy data from 14,836 smart meters covering a small city. We conduct a wide-ranging analysis of the city's gas and electric data to gain insights into the energy consumption of both individual homes and the city as a whole. In doing so, we demonstrate how city-scale smart meter datasets can answer a variety of questions on building energy consumption, such as the impact of weather on energy usage, the correlation between the size and age of a building and its energy usage, the impact of increasing levels of renewable penetration, etc. For example, we show that extreme weather events significantly increase energy usage, e.g., by 36% and 11.5% on hot summer and cold winter days, respectively. As another example, we observe that 700 homes are highly energy inefficient, as their energy demand variability is twice that of the aggregate grid demand. Finally, we study the impact of increasing levels of renewable integration in homes and show that solar penetration rates higher than 20% of demand increase the risk of over-generation and may impact utility operations.

27 citations


Proceedings ArticleDOI
01 Nov 2016
TL;DR: SmartSim, a publicly-available device-accurate smart home energy trace generator, is developed and integrated with NILM-TK, a publicly-available toolkit for Non-Intrusive Load Monitoring (NILM), and compared with traces from a real home to show they yield similar quantitative and qualitative results for representative energy analytics.
Abstract: Utilities have deployed tens of millions of smart meters, which record and transmit home energy usage at fine-grained intervals. These deployments are motivating researchers to develop new energy analytics that mine smart meter data to learn insights into home energy usage and behavior. Unfortunately, a significant barrier to evaluating energy analytics is the overhead of instrumenting homes to collect aggregate energy usage data and data from each device. As a result, researchers typically evaluate their analytics on only a small number of homes, and cannot rigorously vary a home's characteristics to determine what attributes of its energy usage affect accuracy. To address the problem, we develop SmartSim, a publicly-available device-accurate smart home energy trace generator. SmartSim generates energy usage traces for devices by combining a device energy model, which captures its pattern of energy usage when active, with a device usage model, which specifies its frequency, duration, and time of activity. SmartSim then generates aggregate energy data for a simulated home by combining the data from each device. We integrate SmartSim with NILM-TK, a publicly-available toolkit for Non-Intrusive Load Monitoring (NILM), and compare its synthetically generated traces with traces from a real home to show they yield similar quantitative and qualitative results for representative energy analytics.
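The two-model composition described above (a device energy model plus a device usage model, summed into an aggregate home trace) can be sketched minimally. SmartSim's actual energy models capture richer time-varying patterns than the flat draw assumed here, and the devices and wattages are illustrative:

```python
def device_trace(active_power_w: float, on_intervals: list, horizon: int) -> list:
    """Per-second power trace for one device: the usage model is a list of
    (start, end) on-intervals; the energy model is simplified here to a
    flat active power draw."""
    trace = [0.0] * horizon
    for start, end in on_intervals:
        for t in range(start, min(end, horizon)):
            trace[t] = active_power_w
    return trace

def aggregate(traces: list) -> list:
    """Aggregate (smart meter) trace: sum the per-device traces at each step."""
    return [sum(vals) for vals in zip(*traces)]

fridge = device_trace(120.0, [(0, 10)], 20)
kettle = device_trace(1500.0, [(5, 8)], 20)
home = aggregate([fridge, kettle])   # home[6] == 1620.0
```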

23 citations


Proceedings ArticleDOI
12 Mar 2016
TL;DR: A detailed analysis of a state-of-the-art 15MW green multi-tenant data center that incorporates many of the technological advances used in commercial data centers is presented, revealing the benefits of optimizations, and insights into how the various effectiveness metrics change with the seasons and increasing capacity usage are provided.
Abstract: Data centers are an indispensable part of today's IT infrastructure. To keep pace with modern computing needs, data centers continue to grow in scale and consume increasing amounts of power. While prior work on data centers has led to significant improvements in their energy-efficiency, detailed measurements from these facilities' operations are not widely available, as data center design is often considered part of a company's competitive advantage. However, such detailed measurements are critical to the research community in motivating and evaluating new energy-efficiency optimizations. In this paper, we present a detailed analysis of a state-of-the-art 15MW green multi-tenant data center that incorporates many of the technological advances used in commercial data centers. We analyze the data center's computing load and its impact on power, water, and carbon usage using standard effectiveness metrics, including PUE, WUE, and CUE. Our results reveal the benefits of optimizations, such as free cooling, and provide insights into how the various effectiveness metrics change with the seasons and increasing capacity usage. More broadly, our PUE, WUE, and CUE analysis validate the green design of this LEED Platinum data center.
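The effectiveness metrics used in the analysis are simple ratios over IT energy; the example values below are illustrative, not measurements from the paper's facility:

```python
def pue(total_facility_kwh: float, it_kwh: float) -> float:
    """Power Usage Effectiveness: total facility energy over IT energy
    (1.0 is the ideal; lower is better)."""
    return total_facility_kwh / it_kwh

def wue(water_liters: float, it_kwh: float) -> float:
    """Water Usage Effectiveness: liters of water consumed per kWh of IT energy."""
    return water_liters / it_kwh

def cue(co2e_kg: float, it_kwh: float) -> float:
    """Carbon Usage Effectiveness: kg of CO2-equivalent per kWh of IT energy."""
    return co2e_kg / it_kwh

# a facility drawing 1800 kWh to deliver 1500 kWh of IT load has PUE 1.2
efficiency = pue(1800.0, 1500.0)
```

Seasonal effects such as free cooling show up directly in these ratios, which is why the paper tracks them across the year.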

17 citations


Journal ArticleDOI
TL;DR: This article proposes an alternative structure where nearby homes explicitly share energy with each other to balance local energy harvesting and demand in microgrids, and develops a novel energy sharing approach to determine which homes should share energy, and when to minimize system-wide energy transmission losses in the microgrid.
Abstract: Renewable energy (e.g., solar energy) is an attractive option to provide green energy to homes. Unfortunately, the intermittent nature of renewable energy results in a mismatch between when these sources generate energy and when homes demand it. This mismatch reduces the efficiency of using harvested energy by either (i) requiring batteries to store surplus energy, which typically incurs ∼ 20% energy conversion losses, or (ii) using net metering to transmit surplus energy via the electric grid’s AC lines, which severely limits the maximum percentage of renewable penetration possible. In this article, we propose an alternative structure where nearby homes explicitly share energy with each other to balance local energy harvesting and demand in microgrids. We develop a novel energy sharing approach to determine which homes should share energy, and when to minimize system-wide energy transmission losses in the microgrid. We evaluate our approach in simulation using real traces of solar energy harvesting and home consumption data from a deployment in Amherst, MA. We show that our system (i) reduces the energy loss on the AC line by 64% without requiring large batteries, (ii) performance scales up with larger battery capacities, and (iii) is robust to different energy consumption patterns and energy prediction accuracy in the microgrid.
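The core matching decision (which surplus home shares with which deficit home) can be sketched as a greedy pairing that prefers the closest pairs, using distance as a proxy for transmission loss. The paper's policy is more sophisticated, and the homes and distances below are made up:

```python
def match_shares(surplus: list, deficit: list, dist: dict) -> list:
    """Greedily pair surplus homes with deficit homes, closest pairs first,
    so shared energy travels the shortest distance (lowest loss).
    dist maps surplus home -> {deficit home -> distance}."""
    pairs = sorted((dist[s][d], s, d) for s in surplus for d in deficit)
    matched, used_s, used_d = [], set(), set()
    for _, s, d in pairs:
        if s not in used_s and d not in used_d:
            matched.append((s, d))
            used_s.add(s)
            used_d.add(d)
    return matched

matches = match_shares(["A", "B"], ["X", "Y"],
                       {"A": {"X": 1, "Y": 5}, "B": {"X": 2, "Y": 1}})
```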

15 citations


Proceedings ArticleDOI
21 Jun 2016
TL;DR: A Non-Intrusive Model Derivation (NIMD) algorithm is presented to automate modeling of residential electric loads using concepts from power systems, statistics, and machine learning to show that models derived via NIMD are comparable in accuracy to models built by experts and closely approximate the ground truth data.
Abstract: A variety of energy management and analytics techniques rely on models of the power usage of a device over time. Unfortunately, the models employed by these techniques are often highly simplistic, such as modeling devices as simply being on with a fixed power usage or off and consuming little power. As we show, even the power usage of relatively simple devices exhibits much more complexity than a simple on and off state. To address the problem, we present a Non-Intrusive Model Derivation (NIMD) algorithm to automate modeling of residential electric loads using concepts from power systems, statistics, and machine learning. NIMD automatically derives a compact representation of the time-varying power usage of any residential electrical load, including both the device's energy usage and its pattern of usage over time. Such models are useful for a variety of analytics techniques, such as Non-Intrusive Load Monitoring, that have relied on simple on-off models in the past. We evaluate the accuracy of our models by comparing them with both actual ground truth data, and against models that have been designed manually by human experts. We show that models derived via NIMD are comparable in accuracy to models built by experts and closely approximate the ground truth data.
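A first step toward such models is collapsing a raw power trace into a small set of representative power states, rather than assuming a binary on/off model. The online grouping below is a toy sketch of that idea (NIMD itself combines power systems, statistics, and machine learning), and the tolerance and readings are illustrative:

```python
def derive_states(readings: list, tolerance: float = 10.0) -> list:
    """Collapse a raw power trace (watts) into representative power states
    by grouping each reading with an existing state whose running mean is
    within `tolerance` watts, creating a new state otherwise."""
    states = []  # each state is [sum_of_readings, count]
    for w in readings:
        for s in states:
            if abs(w - s[0] / s[1]) <= tolerance:
                s[0] += w
                s[1] += 1
                break
        else:
            states.append([w, 1])
    return sorted(s[0] / s[1] for s in states)

# an off state near 0 W and an active state near 150 W
levels = derive_states([0, 1, 2, 148, 152, 150, 0, 1])
```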

Proceedings Article
20 Jun 2016
TL;DR: This work argues that price volatility will significantly decrease the value of spot servers as the spot market matures, and proposes a more sustainable alternative that offers a variable amount of idle capacity to users for a fixed price, but with transient guarantees.
Abstract: Computational spot markets enable users to bid on servers, and then continuously allocate them to the highest bidder: if a user is "out bid" for a server, the market revokes it and re-allocates it to the new highest bidder. Spot markets are common when trading commodities to balance real-time supply and demand--cloud platforms use them to sell their idle capacity, which varies over time. However, server-time differs from other commodities in that it is "stateful": losing a spot server incurs an overhead that decreases the useful work it performs. Thus, variations in the spot price actually affect the inherent value of server-time bought in the spot market. As the spot market matures, we argue that price volatility will significantly decrease the value of spot servers. Thus, somewhat counter-intuitively, spot markets may not maximize the value of idle server capacity. To address the problem, we propose a more sustainable alternative that offers a variable amount of idle capacity to users for a fixed price, but with transient guarantees.

Proceedings ArticleDOI
12 Mar 2016
TL;DR: The results demonstrate the importance of energy-agile design when considering the benefits of using variable power, and show that GreenSort requires 31% more time and energy to complete when power varies based on real-time electricity prices versus when it is constant.
Abstract: Computing researchers have long focused on improving energy-efficiency under the implicit assumption that all energy is created equal. Yet, this assumption is actually incorrect: energy's cost and carbon footprint vary substantially over time. As a result, consuming energy inefficiently when it is cheap and clean may sometimes be preferable to consuming it efficiently when it is expensive and dirty. Green datacenters adapt their energy usage to optimize for such variations, as reflected in changing electricity prices or renewable energy output. Thus, we introduce energy-agility as a new metric to evaluate green datacenter applications. To illustrate fundamental tradeoffs in energy-agile design, we develop GreenSort, a distributed sorting system optimized for energy-agility. GreenSort is representative of the long-running, massively-parallel, data-intensive tasks that are common in datacenters and amenable to delays from power variations. Our results demonstrate the importance of energy-agile design when considering the benefits of using variable power. For example, we show that GreenSort requires 31% more time and energy to complete when power varies based on real-time electricity prices versus when it is constant. Thus, in this case, real-time prices should be at least 31% lower than fixed prices to warrant using them.

Proceedings ArticleDOI
01 Jan 2016
TL;DR: This work proposes AutoPlug, a system that automatically identifies and tracks the devices plugged into smart outlets in real time without user intervention, and achieves ∼90% identification accuracy on real data collected from 13 distinct device types, while also detecting when a device changes outlets with an accuracy >90%.
Abstract: Low-cost network-connected smart outlets are now available for monitoring, controlling, and scheduling the energy usage of electrical devices. As a result, such smart outlets are being integrated into automated home management systems, which remotely control them by analyzing and interpreting their data. However, to effectively interpret data and control devices, the system must know the type of device that is plugged into each smart outlet. Existing systems require users to manually input and maintain the outlet metadata that associates a device type with a smart outlet. Such manual operation is time-consuming and error-prone: users must initially inventory all outlet-to-device mappings, enter them into the management system, and then update this metadata every time a new device is plugged in or moves to a new outlet. Inaccurate metadata may cause systems to misinterpret data or issue incorrect control actions. To address the problem, we propose AutoPlug, a system that automatically identifies and tracks the devices plugged into smart outlets in real time without user intervention. AutoPlug combines machine learning techniques with time-series analysis of device energy data in real time to accurately identify and track devices on startup, and as they move from outlet-to-outlet. We show that AutoPlug achieves ∼90% identification accuracy on real data collected from 13 distinct device types, while also detecting when a device changes outlets with an accuracy >90%. We implement an AutoPlug prototype on a Raspberry Pi and deploy it live in a real home for a period of 20 days. We show that its performance enables it to monitor up to 25 outlets, while detecting new devices or changes in devices with <50s latency.
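The identification step can be pictured as matching simple features of an outlet's recent energy trace against known device signatures. AutoPlug itself uses richer machine learning and time-series analysis; the nearest-signature matcher below, with its two-feature signatures (mean and peak power), is a deliberately simplified sketch:

```python
def identify_device(features: tuple, signatures: dict) -> str:
    """Return the known device whose signature is closest (Euclidean
    distance) to the observed features of the outlet's energy trace."""
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5
    return min(signatures, key=lambda name: dist(features, signatures[name]))

# hypothetical (mean W, peak W) signatures for two device types
signatures = {"fridge": (120.0, 300.0), "kettle": (1500.0, 1800.0)}
label = identify_device((130.0, 310.0), signatures)   # -> "fridge"
```

Tracking devices as they move between outlets then reduces to re-running identification whenever an outlet's trace stops matching its current label.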

Proceedings ArticleDOI
21 Jun 2016
TL;DR: The prototype SDS system, called SunShade, includes two new mechanisms that enable programmatic solar flow control: one that enforces an absolute limit on solar output, and one that enforces a relative limit on solar output as a fraction of the current maximum power point.
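The two flow-control mechanisms reduce to simple clamping of the inverter's output; the sketch below shows their composition as pure functions, with illustrative wattages (SunShade implements them against real inverter hardware):

```python
def apply_limits(mpp_watts: float, absolute_cap_w: float = None,
                 relative_cap: float = None) -> float:
    """Clamp solar output to an absolute watt cap and/or to a fraction of
    the current maximum power point (MPP), whichever is lower."""
    out = mpp_watts
    if relative_cap is not None:
        out = min(out, relative_cap * mpp_watts)
    if absolute_cap_w is not None:
        out = min(out, absolute_cap_w)
    return out

# a 5 kW panel under a 60% relative cap and a 2.5 kW absolute cap
power = apply_limits(5000.0, absolute_cap_w=2500.0, relative_cap=0.6)
```

The relative cap tracks weather (it scales with the MPP), while the absolute cap gives the grid a hard ceiling, which is why both mechanisms are useful.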
Abstract: Since the electric grid was not designed to support large-scale solar generation, current policies place hard caps on the number of solar systems that connect to the grid. Unfortunately, users are starting to hit these caps, which is restricting solar's natural growth. Software-defined solar (SDS) systems address the problem by dynamically regulating the power they inject into the grid, similar to TCP, to maximize the grid's available solar capacity, maintain grid stability, and fairly share the grid's solar capacity among users. By dynamically regulating solar "flows," SDS systems remove the need for policies that artificially cap solar systems, enabling any SDS system to freely connect to the grid. Our prototype SDS system, called SunShade, includes two new mechanisms that enable programmatic solar flow control: one that enforces an absolute limit on solar output, and one that enforces a relative limit on solar output as a fraction of the current maximum power point. We have implemented both mechanisms, and conducted a preliminary evaluation with an emulated solar panel using real weather traces with different insolation and temperature levels.