
FLOWPROPHET: Generic and Accurate Traffic Prediction for Data-Parallel Cluster Computing

Hao Wang (SJTU and HKUST), Li Chen (HKUST), Kai Chen (HKUST), Ziyang Li (NUDT and HKUST), Yiming Zhang (NUDT), Haibing Guan (SJTU), Zhengwei Qi (SJTU), Dongsheng Li (NUDT), Yanhui Geng (Huawei)

This work was performed when Hao Wang and Ziyang Li were intern students at the SING Group @ HKUST.
Abstract—Data-parallel computing frameworks (DCFs) such as MapReduce, Spark, and Dryad have tremendous applications in big data and cloud computing, and generate massive numbers of flows in data center networks. In this paper, we design and implement FLOWPROPHET, a general framework to predict traffic flows for DCFs. To this end, we analyze and summarize the common features of popular DCFs, and gain a key insight: since application logic in DCFs is naturally expressed by directed acyclic graphs (DAGs), the DAG contains the necessary time and data dependencies for accurate flow prediction. Based on this insight, FLOWPROPHET extracts DAGs from user applications, and uses the time and data dependencies to calculate the flow information 4-tuple, (source, destination, flow_size, establish_time), ahead of time for all flows. We also provide a generic programming interface to FLOWPROPHET, so that current and future DCFs can deploy FLOWPROPHET readily. We implement FLOWPROPHET on both Spark and Hadoop, and perform extensive evaluations on a testbed with 37 physical servers. Our implementation and experiments demonstrate that, ahead of time and with minimal cost, FLOWPROPHET achieves almost 100% accuracy in source, destination, and flow size predictions. With accurate prediction from FLOWPROPHET, the job completion time of a Hadoop TeraSort benchmark is reduced by 12.52% on our cluster with a simple network scheduler.
I. INTRODUCTION
Data-parallel computing frameworks (DCFs) such as MapReduce [1], Dryad [2], and Spark [3] have tremendous applications, especially in big data and cloud computing. DCFs greatly enhance programmers' productivity by abstracting away implementation details, so that programmers can focus on the application logic without worrying about resource contention, task distribution, and so on. They only need to apply the APIs (e.g., filter(), map(), reduce()) to express their logic and manipulate their dataset as if on a single machine.
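As a concrete illustration of this single-machine style of programming, consider the following minimal Spark-style word count in Scala; the SparkContext sc and the HDFS paths are assumptions made for this example, not taken from the paper:

// The programmer chains high-level operators as if working on a local
// collection; the framework turns this into distributed tasks.
// "sc" is an assumed SparkContext; the paths are hypothetical.
val counts = sc.textFile("hdfs://input/corpus")
  .flatMap(line => line.split(" "))
  .filter(word => word.nonEmpty)
  .map(word => (word, 1))
  .reduceByKey(_ + _)
counts.saveAsTextFile("hdfs://output/wordcount")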
DCFs effectively decouple the detailed distributed computing implementation from user programs. However, the lower-level implementation details hold the key to better application performance, and much research effort has recently been devoted to this direction. On the micro level, flow-based optimization mechanisms (e.g., [4]–[8]) attempt to minimize the average completion time of flows or groups of flows by exploiting flow sizes provided by the applications. On the macro level, architectural bandwidth provisioning (e.g., [9]–[12]) and traffic engineering (e.g., [13]–[15]) solutions try to estimate aggregate application traffic demands to enable dynamic network resource allocation. Note that both approaches depend on predicting the future: the traffic and flow information has to be known ahead of time.
Predicting the future is inherently difficult, and most existing solutions settle for heuristic algorithms or for measuring network-level parameters, such as flow counters [9, 13] and socket buffer occupancy [10, 16]. However, these methods are in essence reacting to traffic rather than predicting it, and therefore result in poor performance [17].
More recently, an application-level traffic forecasting solution, HadoopWatch [18], derives traffic by measuring task assignments and data size indications on the file systems at the master and worker nodes in Hadoop. However, this method is customized for Hadoop, and only works when the underlying application logic is as simple as Hadoop's, which can be described in two stages: map and reduce. When the application logic becomes more complex, this method becomes unreliable and inaccurate (or even incorrect) because it does not know where, what, and when to collect useful information. For example, in Spark [3] there are multiple stages, and stages that are consecutive in time may or may not have data dependencies under lazy evaluation [3]. In fact, accurate traffic prediction requires knowledge of the time and data dependencies, which are closely related to the application logic and the corresponding representations in DCFs.
In this paper, we seek a generic and accurate method to
predict flow information for data-parallel cluster computing
frameworks. We specifically set our design goals as follows:
Generic: We should devise a general interface for traffic
prediction that works for all current and future DCFs. To
this end, we should have a general description of application
execution patterns in order to express complex application
logic.
Accurate and fine-grained: The method must be able to
provide accurate flow level information, rather than coarse
aggregated traffic demand, to enable fine-grained network
control and optimization. The method should also provide
detailed inter-flow dependency information to feed recent
coflow optimizations [7, 8].
Ahead-of-time: The method must be able to predict the
flows before they enter the network; and ideally it should
also estimate the flow establish_time accurately.
Scalable and low-overhead: The method should be able
to work at large scale and introduce as little overhead to the
DCFs as possible.

In essence, we aim to calculate the 4-tuple (source,
destination, flow_size, establish_time) for each
flow. The meaning of the first three elements is straightforward. The establish_time is used to determine the exact
time when a flow will establish (e.g., a network scheduler
will need to make a scheduling decision before the flow
establishes). Thus, we need to know both the logical order of
data processing and the locations and sizes of data partitions.
To this end, we examine prevalent DCFs and identify the key observation (details in Section II): since application logic is naturally represented by directed acyclic graphs (DAGs) in all DCFs, the DAG contains the necessary time and data dependencies for accurate flow prediction. With the DAG, we know explicitly where, what, and when to measure in order to accurately calculate flow information for complex parallel computing applications.
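To make the prediction target concrete, the 4-tuple could be represented as a simple record, as in the following Scala sketch; the type and field names are ours for illustration, not FLOWPROPHET's actual data structure:

// Hypothetical representation of the flow-information 4-tuple.
case class FlowInfo(
  source: String,       // worker node holding the data partition
  destination: String,  // worker node that will fetch the partition
  flowSize: Long,       // bytes to be transferred
  establishTime: Long   // expected time (ms) until the flow starts
)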
Based on these insights, we present FLOWPROPHET, a general framework to predict flow information for all DCFs. FLOWPROPHET extracts the DAG from data-parallel applications, and then uses the DAG to guide measurement and prediction. In the course of designing and implementing FLOWPROPHET, we make the following contributions:
• We analyze and summarize the common execution patterns of popular computing frameworks, and extract the DAG to obtain time and data dependencies from applications using these frameworks to guide flow prediction.
• We design FLOWPROPHET, a lightweight, generic, and accurate flow information prediction framework for DCFs. The application programming interface (API) of FLOWPROPHET is general, so that existing and future computing frameworks can readily use FLOWPROPHET to generate accurate flow information.
• We have implemented FLOWPROPHET on the most popular frameworks, Hadoop and Spark, and built a real testbed with 37 servers to evaluate it. Our experiments show that, ahead of time and with negligible overhead to application performance, FLOWPROPHET achieves almost 100% accuracy in source, destination, and flow size predictions.
• Using accurate prediction from FLOWPROPHET, we show that even a simple network-level optimization can greatly improve application performance. In our experiment, the job completion time of a Hadoop TeraSort-25G benchmark is reduced by 12.52% on our 37-server cluster.
The rest of this paper is organized as follows. Section II introduces the key observation that motivates us to leverage the DAG to predict flow information. Section III presents the design and implementation of FLOWPROPHET. Section IV discusses the evaluation benchmarks and results of FLOWPROPHET. Section V reviews related work. Section VI concludes the paper.
II. DAG-ASSISTED FLOW PREDICTION
In this section, we examine how the DAG assists the calculation of flow information (summarized in Figure 1). We first delve into the typical application life-cycle in popular DCFs, and then establish the relationships between application logic, execution sequence, the DAG, and data movement. Finally, we demonstrate the practical steps of flow information prediction using the DAG.
Fig. 1: Data-parallel computing framework: application logic and data movement.
Application Life-cycle: In DCFs, there is a gap between the application logic and the actual operations in the backend cluster, which may contain thousands of CPU cores, because the user application is written as if for a single machine to lower development complexity. To achieve scalable performance, DCFs automatically discover and exploit parallelism in the user's application logic, and distribute parallel computational tasks to every computing node.
The life-cycle of a user application is described in Figure 1. At the start, the user application is resolved into jobs (iterative applications with termination criteria are divided into dependent jobs: each checks the termination criterion to decide whether to move on to the next). For each job, DCFs calculate the order of executions and the data dependencies, which can be described by a DAG, as shown in Figure 1. Specifically, DCFs identify which tasks depend on which data partitions, and plan the parallel execution of the application. These tasks are aggregated into a stage. Then, the tasks in a stage are assigned to workers, and the parallel operations on the dataset are launched. The nodes in the DAG are stages, and the arcs represent dependencies between stages. Data transfer occurs only during stage transitions.
Almost all popular DCFs describe their operations with DAGs. For example, Dryad's [2] execution engine is driven by a graph description language, which empowers the developer with explicit graph construction. Pregel [20], which is based on Bulk Synchronous Parallel (BSP), adopts a sequence of supersteps to construct the user application. Every superstep contains a data communication phase and a barrier synchronization phase, which is essentially a DAG with two vertices and one edge. Spark [3] defines a novel structure named Resilient Distributed Dataset (RDD), which expresses the DAG through RDD lineage. Spark provides transformations and actions (e.g., union(), join(), filter(), map(), take(), etc.) to build the RDD lineage and explicitly express the algorithm logic. Compared with the previous frameworks, MapReduce [1] (or Hadoop [19]) is much simpler: its two primitive semantics, map and reduce, can also be regarded as a DAG containing only two vertices and one edge. CIEL [21] develops a language named Skywriting [22] and a series of operators (e.g., exec(), spawn(), map(), etc.) to express task-level parallelism in a DAG.
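To illustrate how such a DAG arises from ordinary application code, the following Spark sketch in Scala builds an RDD lineage with lazy transformations; nothing executes until the final action. The SparkContext sc and the input paths are assumptions made for this example:

// Lazy transformations only record lineage (the DAG); the count() action
// triggers execution, and the join introduces a shuffle (a stage boundary).
val clicks = sc.textFile("hdfs://input/clicks").map(_.split(","))
val users  = sc.textFile("hdfs://input/users").map(_.split(","))
val joined = clicks.map(c => (c(0), c)).join(users.map(u => (u(0), u)))
val premiumClicks = joined.filter { case (_, (_, u)) => u(1) == "premium" }.count()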

Fig. 2: Data movement patterns. (a) Data shuffle between mappers and reducers in Hadoop [19]; (b) data channels between computing vertices in Dryad [2]; (c) data communication in one superstep of Bulk Synchronous Parallel (BSP) in Pregel [20]; (d) data shuffle between stages in Spark [3].
Observation: The DAG contains the necessary time, data, and flow dependencies for accurate flow prediction.
Time dependency: Time dependency refers to the execution order of stages. DCFs process the DAG one node (stage) at a time in a depth-first-traversal order [3], which yields this order. Some stages may execute in parallel, while others have to wait for the completion of their parent stages. Traffic is generated only between parent and child stages, so with the DAG we know when each flow transmission will occur.
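A minimal sketch of this ordering, using our own simplified representation of the DAG (a map from each stage to its parent stages), is shown below; it is illustrative only and not the scheduler of any particular DCF:

// Post-order depth-first traversal in Scala: every stage is scheduled only
// after all of its parent stages, matching the execution order described above.
def executionOrder(finalStage: Int, parents: Map[Int, List[Int]]): List[Int] = {
  val ordered = scala.collection.mutable.LinkedHashSet[Int]()
  def visit(stage: Int): Unit =
    if (!ordered.contains(stage)) {
      parents.getOrElse(stage, Nil).foreach(visit)  // visit parents first
      ordered += stage
    }
  visit(finalStage)
  ordered.toList
}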
Data dependency: DCFs maintain the life cycle of data: import, transfer, storage, and export. First, data imported into the cluster is split and distributed across the entire cluster. Then, DCFs assign computation tasks to each node based on data locality and the resource scheduling scheme. Along with the execution of computation tasks, intermediate data is generated and cached locally. In Hadoop (Figure 2(a)), a JobTracker informs reducers when and where (i.e., from which mapper node) to fetch data to perform reduce tasks. In Dryad, data channels are maintained between computing vertices (Figure 2(b)), and data flows along these channels. For Pregel, a superstep requires all the computing nodes to exchange data through barrier synchronization before the next superstep (Figure 2(c)). For Spark, data shuffles take place between specific stages based on the dependencies recorded in the RDD lineage (Figure 2(d)).
In summary, since every step of the data life cycle is conducted by the DCF, DCFs are capable of exporting the location and size of every piece of intermediate data and of the final results. Since traffic is essentially data movement, flow prediction requires knowing where, what, and when data is moved, and such information can be retrieved from the DAG. When a stage (a node in the DAG) relies on the output of a group of stages (every stage in this group is called the stage's parent), it has to wait until all the parents are finished. Concurrently running stages do not have data dependencies on each other. Thus, we can infer from the DAG the source (parent stages), destination (child stage), size (amount of data required), and time (upon completion of all parent stages) of the data transmitted between stage transitions.
Flow dependency: The data flows generated between consecutive stages are inter-dependent, because they usually share common communication requirements and objectives (Figure 2). Flow dependency relates to the important concept of a coflow [23], which is a semantically related collection of flows. We observe that edges in the DAG can naturally be used to identify coflows in DCFs, which provides valuable information for coflow-based optimization mechanisms such as [7, 8].
Calculating flow information with DAG: Inspired by these observations, we can design a general method to calculate the flow information 4-tuple, (source, destination, flow_size, establish_time), by developing a set of interfaces to: 1) output the stage context (the current stage, the next stage, and the dependency between them), and 2) extract the locations and sizes of data partitions.
Fig. 3: An example of establish_time.
At the high level, the 4-tuple is calculated as follows
(detailed design and implementation in Section III):
source: we look at the current stages in the DAG, and identify the data partitions that need to be transferred. The worker node containing the data is the source.
destination: we look at the next stages in the DAG, and identify which worker node will work on which piece of data. Thus, the destinations of the data can be identified.
flow_size: we use the interface to look up the sizes of the data partitions to be transmitted.
establish_time: as depicted in Figure 3, FLOWPROPHET outputs the prediction information of a flow at the Prediction Output Time, and the flow begins at the Flow Start Time. The establish_time is defined as the time period between the Prediction Output Time and the Flow Start Time. We develop a heuristic algorithm to estimate the expected establish time intervals for subsequent flows. This algorithm is adaptive to the application and the DCF.
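Using our own simplified data structures (not FLOWPROPHET's code), the first three elements of the 4-tuple can be derived by joining the next stage's task list with the recorded partition status, roughly as in this Scala sketch:

// For each task of the next stage, the partition it fetches gives the source
// and flow_size, and the executor it runs on gives the destination.
case class Task(partitionId: Int, executorHost: String)
case class PartitionStatus(partitionId: Int, host: String, size: Long)

def predictFlows(nextStageTasks: Seq[Task],
                 partitions: Map[Int, PartitionStatus]): Seq[(String, String, Long)] =
  for {
    task      <- nextStageTasks
    partition <- partitions.get(task.partitionId).toSeq
    if partition.host != task.executorHost  // a purely local read generates no network flow
  } yield (partition.host, task.executorHost, partition.size)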
III. FLOWPROPHET DESIGN AND IMPLEMENTATION
We introduce the design and implementation of FLOWPROPHET in this section. First, we dissect flow information prediction in DCFs into several sub-problems and describe our solutions (§ III-A). Then, we present the workflow of FLOWPROPHET to show how the different components work together
(§ III-B). Finally, we go through the implementation details of each component of FLOWPROPHET in § III-C.

Fig. 4: The architecture of FLOWPROPHET.
A. FLOWPROPHET Overview
Figure 4 depicts the architecture of FLOWPROPHET, which
contains 4 modules: DAG Builder, Data Tracker, Data Ag-
gregator, and Flow Calculator (functions explained below).
FLOWPROPHET is attached to DCFs to enable flow pre-
diction. When implementing a general framework to pre-
dict the 4-tuple (source, destination, flow_size,
establish_time) for every upcoming flow in DCFs, we
are essentially solving the following sub-problems:
How to extract the full DAG? The DAG is the pivot for
predicting flow information for DCFs. On the master node of
DCFs, the DAG Builder builds a full DAG by parsing event
messages from the DCF master interfaces.
How to collect data partition status? When a stage is completed, the computation results are kept as data partitions in the local disk or local memory of each worker node. A data partition status contains the stage_ID, partition_ID, and size. The Data Tracker receives event messages from the DCF worker interfaces and maintains a data structure to record all data partition statuses. The Data Aggregator requests the status of each data partition from the Data Tracker on each worker.
How to be scalable and lightweight? We pursue scalability and low overhead in the design of FLOWPROPHET. All modules in FLOWPROPHET follow the principles of the Actor Model to exchange messages. The Actor Model is an asynchronous programming model for distributed applications [24]. Actors are fairly lightweight concurrent entities that process messages asynchronously using an event-driven receive loop. The Actor Model offers a high level of abstraction for achieving high concurrency and parallelism.
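As a minimal sketch of this style (not FLOWPROPHET's actual actors), an Akka actor with an event-driven receive loop could look as follows; the message type, actor name, and behavior are our assumptions:

import akka.actor.{Actor, ActorSystem, Props}

case class StageFinished(stageId: Int)  // hypothetical event message

class DagBuilderActor extends Actor {
  def receive = {                       // asynchronous, event-driven receive loop
    case StageFinished(id) =>
      // e.g., trigger collection of partition status for stage `id`
      println(s"stage $id finished")
  }
}

object ActorSketch extends App {
  val system  = ActorSystem("flowprophet-sketch")
  val builder = system.actorOf(Props[DagBuilderActor](), "dag-builder")
  builder ! StageFinished(3)            // the message lands in the actor's mailbox
}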
B. FLOWPROPHET Workflow
Figure 5 depicts how the modules of FLOWPROPHET cooperate to predict upcoming flows when a stage is finished.
Fig. 5: Sequence diagram beginning with the event that the current stage is finished.
When the DAG Builder receives a message that the current stage is finished, it checks whether there will be traffic between the current stage and the next stage. If yes, the DAG Builder sends the current stage ID to ask the Data Aggregator to collect data partition status from each Data Tracker. After the Data Aggregator finishes the collection, FLOWPROPHET knows the locations and sizes of all data partitions. Then, when the DAG Builder is notified that a new stage is
beginning, it sends the stage context to the Flow Calculator. The stage context contains the tasks and the parent stage IDs of the next stage. Each task is identified by (partition_ID, executor_ID, func). The Flow Calculator then combines and matches the task list and stage list with the data partition status list to output the (source, destination, flow_size) for each flow. Note that task failures cause the corresponding data partitions to be transmitted again. FLOWPROPHET handles task failures as follows: each Data Tracker receives task failure events from the DCF worker and notifies the Flow Calculator of the extra flow information. Finally, the Flow Calculator obtains the establish_time with a heuristic algorithm.
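To make the exchanges in Figure 5 concrete, the messages could be modeled as simple immutable case classes, as in the hypothetical sketch below; the names and shapes are ours, not FLOWPROPHET's actual protocol:

// Hypothetical messages mirroring the sequence diagram of Fig. 5.
case class CurrentStageFinished(stageId: Int)                        // DCF master  -> DAG Builder
case class CollectPartitionStatus(stageId: Int)                      // DAG Builder -> Data Aggregator
case class DataPartitionStatus(stageId: Int, partitionId: Int,
                               size: Long, location: String)         // Data Tracker -> Data Aggregator
case class NextStageStart(tasks: List[(Int, String)],                // (partitionId, executorHost)
                          parentStageIds: List[Int])                 // DAG Builder -> Flow Calculator
case class TaskFailure(taskId: Int, stageId: Int, partitionId: Int)  // DCF worker  -> Data Tracker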
C. FLOWPROPHET Implementation
We now describe the implementation of the four modules of FLOWPROPHET in detail. We implement FLOWPROPHET in Scala 2.10.4. We apply the actor model based on the Akka 2.3.4 framework [25], which enables the FLOWPROPHET modules to communicate asynchronously and concurrently at low overhead. In addition, to export DCF-intrinsic information, we have implemented the APIs for the master and workers of Spark 1.0.0 and Hadoop 0.20.2.
Event Definition                      | Trigger Condition
newStageEvent(stageID, childStageID)  | a new stage is created
stageStartEvent(List[task], stageID)  | a stage is beginning
stageFinishedEvent(stageID)           | a stage is finished
TABLE I: The required APIs for the DCF master.
DAG Builder: The DAG Builder relies on information provided by the DCF to build a full DAG. DCF developers only need to implement a set of simple interfaces providing primitive events, which are outlined in Table I. Similarly to the DAG Builder, the Data Tracker also requires notifications of events from the DCF worker.
DAGBuilder Handlers
newStageHandler(newStageEvent)            | (currentStage, childStage)
stageStartHandler(stageStartEvent)        | Event(List[task], List[stageID])
stageFinishedHandler(stageFinishedEvent)  | Event(stageID)
TABLE II: The DAG Builder event handlers.
When a new stage is created in the DCF, a newStageEvent is raised, from which the DAG Builder obtains the new stage ID and its child stage ID. Using the handlers defined in Table II, the DAG Builder constructs a full DAG from all the collected pairs of parent and child stages.
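A sketch of the thin instrumentation a DCF master might add to raise the Table I events toward the DAG Builder is shown below; the event shapes follow Table I, while the class and hook names are our assumptions:

import akka.actor.ActorRef

case class NewStageEvent(stageId: Int, childStageId: Int)
case class StageStartEvent(taskIds: List[Int], stageId: Int)  // tasks abbreviated to IDs here
case class StageFinishedEvent(stageId: Int)

// Hypothetical hooks a DCF master could call from its scheduler to
// notify the DAG Builder actor of stage life-cycle events.
class MasterInstrumentation(dagBuilder: ActorRef) {
  def onStageCreated(stageId: Int, childStageId: Int): Unit =
    dagBuilder ! NewStageEvent(stageId, childStageId)
  def onStageStart(taskIds: List[Int], stageId: Int): Unit =
    dagBuilder ! StageStartEvent(taskIds, stageId)
  def onStageFinished(stageId: Int): Unit =
    dagBuilder ! StageFinishedEvent(stageId)
}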

DCFs process stages in a depth-first-traversal order, and traffic does not always take place between two consecutive stages. To provide accurate prediction, it is therefore necessary to check the data dependency between the current stage and the next stage. For example, in job #n of Figure 1, traffic only happens at the following three moments: after stage 2 and stage 3 both complete, after stage 5 completes, and after stage 1 and stage 4 both complete.
Furthermore, the stageStartEvent contains a list of tasks and the stage ID. In each task, the executor_ID is where the task is to be executed; the partition_ID indicates the data partition that the task will fetch; and the func is a set of nested procedures that can be executed independently.
Data Aggregator: To manage all the Data Trackers, we place
a Data Aggregator on the master, which organizes partition
status from Data Trackers and exports a query interface for
the Flow Calculator (Table III).
DataAggregator Methods                                      | Caller
query(List[partitionID, stageID]): List[(location, size)]   | FlowCalculator
TABLE III: The Data Aggregator API.
When the Data Aggregator receives a stage ID from the DAG Builder, it broadcasts the stage ID to all the Data Trackers. Each Data Tracker then replies with a list of data partition statuses for that stage ID. The Data Aggregator then builds a HashMap to cache these data partition statuses with the stage ID as the key. In addition, the Data Aggregator appends to each data partition status a location field, which is the IP address or hostname of the worker that keeps the data partition.
In DCFs, there can be thousands of workers or more, which means there are the same number of Data Trackers. Leveraging the Actor Model, all the messages sent from the Data Tracker actors are placed in the mailbox of the Data Aggregator actor, and the Data Aggregator processes them in an asynchronous, non-blocking way.
Once the Data Aggregator receives a query request from
the Flow Calculator, it will reply with a list of location and
size for each data partition matching the stage ID.
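A minimal sketch of such a cache, under our own assumed types (an illustration, not FLOWPROPHET's implementation), could look like this:

import scala.collection.mutable

case class PartitionRecord(stageId: Int, partitionId: Int, size: Long, location: String)

// Partition status cached per stage ID, with the reporting worker's
// host stored as the location field.
class AggregatorCache {
  private val byStage = mutable.HashMap.empty[Int, mutable.Buffer[PartitionRecord]]

  def record(workerHost: String, stageId: Int, partitionId: Int, size: Long): Unit =
    byStage.getOrElseUpdate(stageId, mutable.Buffer.empty[PartitionRecord]) +=
      PartitionRecord(stageId, partitionId, size, workerHost)

  // Table III-style query: (location, size) for the requested partitions of a stage.
  def query(stageId: Int, partitionIds: Seq[Int]): Seq[(String, Long)] =
    byStage.getOrElse(stageId, mutable.Buffer.empty[PartitionRecord])
      .filter(r => partitionIds.contains(r.partitionId))
      .map(r => (r.location, r.size))
      .toSeq
}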
Data Tracker: Similar to the DAG Builder, which relies on primitive information from the DCF master, a Data Tracker receives and records event messages from the DCF worker. The event messages are defined in Table IV.
Event Definition                                 | Trigger Condition
taskFailureEvent(taskID, stageID, partitionID)   | a task has failed
taskFinishedEvent(stageID, partitionID, size)    | a task is finished
TABLE IV: The required APIs for the DCF worker.
The computation takes place on each worker in DCFs, i.e., the func encapsulated in each task is extracted and executed by executors. In general, the computation results are written back to the local disk (e.g., in Hadoop) or, for high performance, kept in local memory (e.g., in Spark). Moreover, most DCFs are designed to be fault-tolerant, and they only attempt re-execution of failed tasks a limited number of times. To predict the extra flows generated by task re-execution, the Data Tracker needs to be notified of failed tasks. The required APIs are simple to implement, adding fewer than 50 lines of code in the DCF task life-cycle context.
The Data Tracker constructs a HashMap with the stage ID as the key and a list of partition IDs and sizes as the value. The Data Tracker updates the HashMap when a taskFinishedEvent is raised by the DCF interface. Then, when the Data Aggregator requests the status of the data partitions of a stage ID, the Data Tracker replies with a list in which each data partition is recorded as (stage_ID, partition_ID, size).
DataTracker Methods                                   | Caller
query(stageID): List[(stageID, partitionID, size)]    | DataAggregator
TABLE V: The Data Tracker API.
Flow Calculator: The Flow Calculator is the converging point of the knowledge on time dependency and data dependency: it calculates the flow information (source, destination, flow_size) and estimates the flow establish_time.
Flow information: Once the DAG Builder captures the stageStartEvent, it delivers two lists to the Flow Calculator: one contains the tasks that are just starting, and the other contains all the parent stage IDs. By traversing the list of tasks, the Flow Calculator queries the Data Aggregator for the location and size of the data partition that each task will fetch. Thus, the location of the data partition is the source, the executor_ID indicates the destination, and the size of the data partition is the traffic volume flow_size. Since the predicted flows will not take place until all the tasks on the master have been delivered to the designated workers, the Flow Calculator will most likely export flow information in advance. As shown in our experiments in Section IV, FLOWPROPHET can predict flow information strictly ahead of time.
Flow establish time: FLOWPROPHET is able to calculate the flow information of the next stage ahead of time. After the current stage is completed, DCFs usually perform a relatively fixed number of operations to start the next stage, and we refer to this period of time as the flow establish_time. For a specific application, the establish_time is likely to fall within a range. This is confirmed by our experiments (Figure 7), in which the establish_times all exhibit heavy-tailed distributions in different DCFs. The majority of establish_times concentrate in a small range, with occasional outliers (e.g., due to network congestion).
However, different configurations of DCFs and applications may result in different establish_times, and it is difficult to predict them accurately for all DCFs and all applications. Therefore, we introduce an adaptive algorithm to infer the establish_time of flows for different applications.
For an application, the algorithm tracks the average and variance of the establish_time of the previous flows via the exponentially weighted moving average (EWMA) method [26]. EWMA has less lag than the naive moving average method and is more sensitive to recent establish_times, which fits our goal of tracking the current application. We describe the estimation method as follows:
Let t_i be the expected establish time of the i-th stage and σ the standard deviation. It follows that the establish_time
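A standard EWMA tracker of the mean and deviation of establish_time, along the lines the paragraph above describes, might look like the following sketch; the smoothing factor alpha and the exact update rule are our assumptions, not the paper's formula:

// EWMA of the mean and absolute deviation of observed establish_times,
// similar in spirit to classic RTT estimation.
class EwmaEstablishTime(alpha: Double = 0.25) {
  private var mean: Double = 0.0
  private var dev: Double  = 0.0
  private var seeded = false

  def update(observedMs: Double): Unit =
    if (!seeded) { mean = observedMs; seeded = true }
    else {
      val err = observedMs - mean
      mean += alpha * err                               // EWMA of the mean
      dev = (1 - alpha) * dev + alpha * math.abs(err)   // EWMA of the deviation
    }

  def expectedMs: Double  = mean
  def deviationMs: Double = dev
}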

References
[1] J. Dean and S. Ghemawat, "MapReduce: Simplified data processing on large clusters," OSDI 2004; also CACM 51(1), 2008.
[2] M. Isard, M. Budiu, Y. Yu, A. Birrell, and D. Fetterly, "Dryad: Distributed data-parallel programs from sequential building blocks," EuroSys 2007.
[3] M. Zaharia, M. Chowdhury, M. J. Franklin, S. Shenker, and I. Stoica, "Spark: Cluster computing with working sets," HotCloud 2010.
[20] G. Malewicz et al., "Pregel: A system for large-scale graph processing," SIGMOD 2010.