
The Impact of Mobile Multimedia Applications on
Data Center Consolidation

Kiryong Ha, Padmanabhan Pillai, Grace Lewis, Soumya Simanta,
Sarah Clinch, Nigel Davies, Mahadev Satyanarayanan

October 2012
CMU-CS-12-143

School of Computer Science
Carnegie Mellon University
Pittsburgh, PA 15213

Intel Labs, Software Engineering Institute, Lancaster University
Abstract
The convergence of mobile computing and cloud computing enables new multimedia applications that are both resource-
intensive and interaction-intensive. For these applications, end-to-end network bandwidth and latency matter greatly when
cloud resources are used to augment the computational power and battery life of a mobile device. We first present quantitative
evidence that this crucial design consideration to meet interactive performance criteria limits data center consolidation. We
then describe an architectural solution that is a seamless extension of today’s cloud computing infrastructure.
Copyright 2012 Carnegie Mellon University
This research was supported by the National Science Foundation (NSF) under grant numbers CNS-0833882 and IIS-1065336, by
an Intel Science and Technology Center grant, and by the Department of Defense (DoD) under Contract No. FA8721-05-C-0003 for
the operation of the Software Engineering Institute (SEI), a federally funded research and development center. Any opinions, findings,
conclusions or recommendations expressed in this material are those of the authors and do not necessarily represent the views of the NSF,
Intel, DoD, SEI, or Carnegie Mellon University. This material has been approved for public release and unlimited distribution except as
restricted by copyright.

Keywords: mobile computing, cloud computing, cyber foraging, smartphones, virtual machines, system architecture,
cloudlets, disconnected operation, energy efficiency, battery life, face recognition, speech recognition, object
recognition, augmented reality, simulation-based graphics

1 Introduction
The convergence of cloud computing and mobile computing has begun. Apple’s Siri for the iPhone [1], which
performs compute-intensive speech recognition in the cloud, hints at the rich commercial opportunities in this
emerging space. Rapid improvements in sensing, display quality, connectivity, and compute power of mobile
devices will lead to new cloud-enabled mobile applications that embody voice-, image-, motion- and location-
based interactivity. Siri is just the leading edge of this disruptive force.
Many of these new applications will be interactive as well as resource-intensive, pushing well beyond the
processing, storage, and energy limits of mobile devices. When their use of cloud resources is in the critical path
of user interaction, end-to-end operation latencies can be no more than a few tens of milliseconds. Violating this
bound results in distraction and annoyance to a mobile user who is already attention-challenged. Such fine-grained
cloud usage is different from the coarse-grained usage models and SLA guarantees of cloud computing today.
The central contribution of this paper is the experimental evidence that these new applications force a fundamental
change in cloud computing architecture. We describe five example applications of this genre in Section 2,
and experimentally demonstrate in Section 3 that even with the rapid improvements predicted for mobile computing
hardware, such applications will benefit from cloud resources. The remainder of the paper explores the architectural
implications of this class of applications. In the past, centralization was the dominant theme of cloud computing.
This is reflected in the consolidation of dispersed compute capacity into a few large data centers. For example,
Amazon Web Services spans the entire planet with just a handful of data centers located in Oregon, N. California,
Virginia, Ireland, Singapore, Tokyo, and São Paulo. The underlying value proposition of cloud computing is
that centralization exploits economies of scale to lower the marginal cost of system administration and operations.
These economies of scale evaporate if too many data centers have to be maintained and administered.
Aggressive global consolidation of data centers implies large average separation between a mobile device and
its cloud. End-to-end communication then involves many network hops and results in high latencies, as quantified
in Section 4 using measurements from Amazon EC2. Under these conditions, achieving crisp interactive response
for latency-sensitive mobile applications will be a challenge. Limiting consolidation and locating small data centers
much closer to mobile devices would solve this problem, but it would sacrifice the key benefit of cloud computing.
How do we achieve the right balance? Can we support latency-sensitive and resource-intensive mobile ap-
plications without sacrificing the consolidation benefits of cloud computing? Section 5 shows how a two-level
architecture can reconcile this conflict. The first level of this hierarchy is today’s unmodified cloud infrastructure.
The second level is new. It consists of dispersed but unmanaged infrastructure with no hard state. Each second-
level element is effectively a “second-class data center” with soft state generated locally or cached on demand from
the first level. Data center proximity to mobile devices is thus achieved by the second level without limiting the
consolidation achievable at the first level. Communication between first and second levels is outside the critical
path of interactive mobile applications. This hierarchical structure also has an additional benefit. As discussed in
Section 6, it improves availability when cloud connectivity is fragile and prone to disruption.
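The cache-on-demand behavior of a second-level element can be sketched in a few lines. The sketch below is illustrative only; the class name, the cloud_fetch callback, and the string keys are assumptions, not details from the paper:

```python
class Cloudlet:
    """A 'second-class data center': holds only soft state, cached on demand."""

    def __init__(self, cloud_fetch):
        self.cloud_fetch = cloud_fetch  # callable into the first-level cloud
        self.cache = {}                 # soft state: safe to lose at any time

    def get(self, key):
        # Serve locally when possible; otherwise fall back to the cloud.
        # The fetch happens outside the latency-critical interactive path.
        if key not in self.cache:
            self.cache[key] = self.cloud_fetch(key)
        return self.cache[key]

    def evict_all(self):
        # Losing the cache costs only a refetch, never correctness,
        # because the cloudlet holds no hard state.
        self.cache.clear()
```

Because every entry can be regenerated from the first level, a cloudlet needs none of the careful state management that makes real data centers expensive to administer.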
Throughout this paper, the term “cloud computing” refers to transient use of computational cloud resources by
mobile clients. Other forms of cloud usage such as processing of large datasets (data-intensive computing) and
asynchronous long-running computations (agent-based computing) are outside the scope of this paper.
2 Mobile Multimedia Applications
Beyond today’s familiar desktop, laptop, and smartphone applications is a new genre of software to seamlessly
augment human perception and cognition. Consider Watson, IBM’s question-answering technology that publicly
demonstrated its prowess in 2011 [2]. Imagine such a tool being available anywhere and anytime to rapidly respond
to urgent questions posed by an attention-challenged mobile user. Such a vision may be within reach in the next
decade. Free-form speech recognition, natural language translation, face recognition, object recognition, dynamic
action interpretation from video, and body language interpretation are other examples of this genre of futuristic
applications. Although a full-fledged cognitive assistance system is out of reach today, we investigate several
smaller applications that are building blocks towards this vision. Five such applications are described below.
2.1 Face Recognition (FACE)
A most basic and fundamental perception task is the recognition of human faces. The problem has been long studied
in the computer vision community, and fast algorithms for detecting human faces in images have been available for
some time [3]. Identification of individuals through computer vision is still an area of active research, spurred by
applications in security and surveillance tasks. However, such technology is also very useful in mobile devices for
personal information management and cognitive assistance. For example, an application that can recognize a face
and remind you who it is (by name, contact information, or context in which you last met) can be quite useful to
everyone, and invaluable to those with cognitive or visual impairments. Such an application is most useful if it can
be used anywhere, and can quickly provide a response to avoid potentially awkward social situations.
The face recognition application studied here detects faces in an image, and attempts to identify the face from
a prepopulated database. The application uses a Haar Cascade of classifiers to do the detection, and then uses the
Eigenfaces method [4] based on principal component analysis (PCA) to make an identification. The implementation is
based on OpenCV [5] image processing and computer vision routines, and runs on a Microsoft Windows
environment. Training the classifiers and populating the database are done offline, so our experiments only consider
the execution time of the recognition task on a pre-trained system.
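The identification step can be illustrated with a toy NumPy version of the Eigenfaces idea: project centered images onto the top principal components and pick the nearest enrolled face. The synthetic "gallery" and all names below are assumptions for illustration; the actual application uses OpenCV's trained Haar cascade and face models:

```python
import numpy as np

def train_eigenfaces(faces, k=4):
    """faces: (n, d) array, one flattened face image per row."""
    mean = faces.mean(axis=0)
    centered = faces - mean
    # The top right-singular vectors of the centered data are the eigenfaces.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    eigenfaces = vt[:k]                  # (k, d) PCA basis
    coeffs = centered @ eigenfaces.T     # (n, k) projections of the gallery
    return mean, eigenfaces, coeffs

def identify(probe, mean, eigenfaces, coeffs):
    """Index of the enrolled face whose PCA coefficients are nearest the probe's."""
    w = (probe - mean) @ eigenfaces.T
    return int(np.argmin(np.linalg.norm(coeffs - w, axis=1)))

rng = np.random.default_rng(0)
gallery = rng.normal(size=(10, 64))      # ten synthetic 8x8 "faces", flattened
mean, basis, coeffs = train_eigenfaces(gallery)
probe = gallery[3] + rng.normal(scale=0.01, size=64)   # noisy view of face 3
print(identify(probe, mean, basis, coeffs))  # → 3
```

Training (the SVD) is the expensive offline step; identification at runtime is just one projection and a nearest-neighbor search, which is what the experiments measure.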
2.2 Speech Recognition (SPEECH)
Speech as a modality of interaction between human users and computers is a long-studied area of research. Most
success has been in very specific domains or in applications requiring a very limited vocabulary, such as interactive
voice response in phone answering services, and hands-free, in-vehicle control of cell phones. Several recent
commercial efforts aim for general purpose information query, device control, and language translation using speech
input on mobile devices [1, 6, 7].
The speech recognition application considered here is based on an open-source speech-to-text framework based
on Hidden Markov Model (HMM) recognition systems [8]. It takes as input digitized audio of a spoken English
sentence, and attempts to extract all of the words in plain text format. This application is single-threaded. Since it
is written in Java, it can run on both Linux and Microsoft Windows. For this paper, we ran it on Linux.
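The heart of an HMM recognizer is Viterbi decoding: choosing the most probable hidden state sequence for the observed acoustics. A toy pure-Python Viterbi over an invented two-state silence/speech model (the model and its probabilities are illustrative; the real framework decodes phoneme-level HMMs over acoustic feature vectors):

```python
import math

def viterbi(obs, states, start_p, trans_p, emit_p):
    """Most likely hidden state path for an observation sequence (log-space)."""
    # V[t][s] = best log-probability of any path ending in state s at time t
    V = [{s: math.log(start_p[s]) + math.log(emit_p[s][obs[0]]) for s in states}]
    path = {s: [s] for s in states}
    for t in range(1, len(obs)):
        V.append({})
        new_path = {}
        for s in states:
            # Best predecessor for state s at time t.
            lp, prev = max(
                (V[t - 1][p] + math.log(trans_p[p][s]) + math.log(emit_p[s][obs[t]]), p)
                for p in states
            )
            V[t][s] = lp
            new_path[s] = path[prev] + [s]
        path = new_path
    best = max(states, key=lambda s: V[-1][s])
    return path[best]

# Invented model: two hidden states emitting discrete acoustic symbols.
states = ("sil", "speech")
start_p = {"sil": 0.8, "speech": 0.2}
trans_p = {"sil": {"sil": 0.7, "speech": 0.3},
           "speech": {"sil": 0.2, "speech": 0.8}}
emit_p = {"sil": {"quiet": 0.9, "loud": 0.1},
          "speech": {"quiet": 0.2, "loud": 0.8}}
print(viterbi(["quiet", "loud", "loud"], states, start_p, trans_p, emit_p))
# → ['sil', 'speech', 'speech']
```

The dynamic program is inherently sequential along the observation stream, which is consistent with the application being single-threaded.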
2.3 Object and Pose Identification (OBJECT)
A third application is based on a computer vision algorithm originally developed for robotics [9], but modified for
use by handicapped users. The computer vision system identifies known objects, and importantly, also recognizes
the position and orientation of the objects relative to the user. This information is then used to guide the user in
manipulating a particular object.
Here, the application identifies and locates known objects in a scene. The implementation runs on Linux, and
makes use of multiple cores. The system extracts key visual elements (SIFT features [10]) from an image, matches
these against a database of features from a known set of objects, and finally performs geometric computations
to determine the pose of the identified object. For the experiments in this paper, the database is populated with
thousands of features extracted from more than 500 images of 13 different objects.
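Matching extracted features against the database is typically done with nearest-neighbor search plus Lowe's ratio test, which discards ambiguous matches. A NumPy sketch on synthetic descriptors (the function and data are illustrative; real SIFT descriptors are 128-dimensional histograms of gradient orientations):

```python
import numpy as np

def match_features(query, database, ratio=0.8):
    """Return (query_idx, db_idx) pairs that pass Lowe's ratio test."""
    matches = []
    for i, q in enumerate(query):
        d = np.linalg.norm(database - q, axis=1)   # distance to every db descriptor
        nearest, second = np.argsort(d)[:2]
        # Accept only if the best match is clearly better than the runner-up.
        if d[nearest] < ratio * d[second]:
            matches.append((i, int(nearest)))
    return matches

rng = np.random.default_rng(1)
db = rng.normal(size=(100, 128))                   # stand-in descriptor database
query = db[[5, 42]] + rng.normal(scale=0.05, size=(2, 128))  # noisy views
print(match_features(query, db))  # → [(0, 5), (1, 42)]
```

With thousands of database features, the linear scan here would be replaced by an approximate nearest-neighbor structure, but the ratio-test logic is the same.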
2.4 Mobile Augmented Reality (AUGREAL)
The defining property of a mobile augmented reality application is the display of timely and relevant information
as an overlay on top of a live view of some scene. For example, it may show street names, restaurant ratings

Year   Typical Server Processor    Speed     Typical Handheld Device      Speed
1997   Pentium II                  266 MHz   PalmPilot                    16 MHz
2002   Itanium                     1 GHz     Blackberry 5810              133 MHz
2007   Core 2 (4 cores)            9.6 GHz   Apple iPhone                 412 MHz
2011   Xeon X5 (2x6 cores)         32 GHz    Samsung Galaxy S2 (2 cores)  2.4 GHz

Figure 1: Evolution of Hardware Performance (adapted from Flinn [15])
or directional arrows overlaid on the scene captured through a smartphone’s camera. Special mobile devices that
incorporate cameras and see-through displays in a wearable form factor [11] can be used instead of a smartphone.
AUGREAL uses computer vision to identify actual buildings and landmarks in a scene, and label them precisely
in the view [12]. This is akin to an image-based query in Google Goggles [13], but running continuously on a
live video stream. AUGREAL extracts a set of features from the scene image, and uses the feature descriptors to
find similar-looking entries in a database constructed using features from labeled images of known landmarks and
buildings. The database search is kept tractable by spatially indexing the data by geographic locations, and limiting
search to a slice of the database relevant to the current GPS coordinates. The prototype uses 1005 labeled images
of 200 buildings as the relevant database slice. AUGREAL runs on Microsoft Windows, and makes significant use
of OpenCV libraries [5], Intel Performance Primitives (IPP) libraries, and multiple processing threads.
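Spatial indexing by GPS coordinate can be as simple as bucketing landmark entries into a latitude/longitude grid and searching only the buckets near the query. A hypothetical sketch (the grid granularity, class, and coordinates are assumptions, not the paper's implementation):

```python
import math
from collections import defaultdict

GRID = 0.01  # assumed bucket size in degrees (roughly 1 km of latitude)

def cell(lat, lon):
    """Grid cell containing a GPS coordinate."""
    return (math.floor(lat / GRID), math.floor(lon / GRID))

class LandmarkIndex:
    """Toy spatial index: landmark entries bucketed by GPS grid cell."""

    def __init__(self):
        self.buckets = defaultdict(list)

    def add(self, name, lat, lon):
        self.buckets[cell(lat, lon)].append(name)

    def slice_near(self, lat, lon):
        """Database slice: the query's cell plus its eight neighbors."""
        r, c = cell(lat, lon)
        return [name
                for dr in (-1, 0, 1) for dc in (-1, 0, 1)
                for name in self.buckets[(r + dr, c + dc)]]

idx = LandmarkIndex()
idx.add("Gates Center", 40.4433, -79.9446)            # Pittsburgh
idx.add("Cathedral of Learning", 40.4443, -79.9532)   # Pittsburgh
idx.add("Lancaster Castle", 54.0497, -2.8057)         # UK -- outside the slice
print(sorted(idx.slice_near(40.4436, -79.9450)))
```

Only the features of landmarks in the returned slice need to be compared against the scene, which is what keeps the per-frame search tractable.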
2.5 Physical Simulation and Rendering (FLUID)
Our final application is used in computer graphics. Using accelerometer readings from a mobile device, it phys-
ically models the motion of imaginary fluids with which the user can interact. For example, it can show liquid
sloshing around in a container depicted on a smartphone screen, such as a glass of water carried by the user as he
walks or runs. The application backend runs a physics simulation, based on the predictive-corrective incompressible
smoothed particle hydrodynamics (PCISPH) method [14]. We note that the computational structure of this
application is representative of many other applications, particularly “real-time” (i.e., not turn-based) games.
FLUID is implemented as a multithreaded Linux application. To ensure a good interactive experience, the
delay between user input and output state change has to be very low, on the order of 100ms. In our experiments,
FLUID simulates a 2218 particle system with 20 ms timesteps, generating up to 50 frames per second.
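Keeping the simulation on a fixed 20 ms timestep while frames render at a variable rate is the classic fixed-timestep loop: accumulate elapsed wall-clock time and consume it in whole physics steps. A minimal sketch (the placeholder physics step stands in for PCISPH; function names are ours):

```python
DT = 0.020  # fixed 20 ms physics timestep, as in the FLUID experiments

def step(state, dt):
    """Placeholder physics step: advance position by velocity."""
    pos, vel = state
    return (pos + vel * dt, vel)

def advance(state, elapsed, accumulator=0.0):
    """Consume wall-clock time in fixed DT steps; return state, leftover, count."""
    accumulator += elapsed
    steps = 0
    while accumulator >= DT:
        state = step(state, DT)
        accumulator -= DT
        steps += 1
    return state, accumulator, steps

# One rendered frame arriving 50 ms after the last triggers two 20 ms steps,
# leaving 10 ms of simulation time banked for the next frame.
state, leftover, n = advance((0.0, 1.0), 0.050)
print(n, round(state[0], 3), round(leftover, 3))  # → 2 0.04 0.01
```

Decoupling the two rates this way keeps the simulation deterministic regardless of how fast frames can be rendered, which matters for "real-time" games with the same structure.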
3 Why Cloud Resources are Necessary
3.1 Mobile Hardware Performance
Handheld or body-worn mobile devices are always resource-poor relative to server hardware of comparable vintage [16].
Figure 1, adapted from Flinn [15], illustrates the consistent large gap in the processing power of typical
server and mobile device hardware over a 15-year period. This stubborn gap reflects a fundamental reality of user
preferences: Moore’s Law has to be leveraged differently on hardware that people carry or wear for extended periods
of time. This is not just a temporary limitation of current mobile hardware technology, but is intrinsic to

References

Distinctive Image Features from Scale-Invariant Keypoints
Eigenfaces for Recognition
Robust Real-Time Face Detection
The Case for VM-Based Cloudlets in Mobile Computing