Proceedings ArticleDOI

Load-aware shedding in stream processing systems

TL;DR: This paper provides a theoretical analysis proving that LAS is an (ε, δ)-approximation of the optimal online load shedder and shows its performance through a practical evaluation based both on simulations and on a running prototype.
Abstract: Load shedding is a technique employed by stream processing systems to handle unpredictable spikes in the input load whenever available computing resources are not adequately provisioned. A load shedder drops tuples to keep the input load below a critical threshold and thus avoid tuple queuing and system thrashing. In this paper we propose Load-Aware Shedding (LAS), a novel load shedding solution that drops tuples with the aim of maintaining queuing times below a tunable threshold. Tuple execution durations are estimated at runtime using efficient sketch data structures. We provide a theoretical analysis proving that LAS is an (ε, δ)-approximation of the optimal online load shedder and show its performance through a practical evaluation based both on simulations and on a running prototype.

Summary (3 min read)

1. INTRODUCTION

  • Distributed stream processing systems (DSPS) are today considered a mainstream technology to build architectures for the real-time analysis of big data.
  • This latter aspect is often critical, as input data streams may unpredictably change over time both in rate and content.
  • Existing load shedding solutions either randomly drop tuples when bottlenecks are detected or apply a pre-defined model of the application and its input that allows them to deterministically take the best shedding decision.
  • The tuple execution duration, in fact, may depend on the tuple content itself.
  • Afterwards, Section 3 details LAS whose behavior is then theoretically analyzed in Section 4.

2. SYSTEM MODEL AND PROBLEM DEFINITION

  • The authors consider a distributed stream processing system (DSPS) deployed on a cluster where several computing nodes exchange data through messages sent over a network.
  • Data injected by source operators is encapsulated in units called tuples and each data stream is an unbounded sequence of tuples.
  • Without loss of generality, here the authors assume that each tuple t is a finite set of key/value pairs that can be customized to represent complex data structures.
  • On the other hand, the input throughput of the stream may vary, even with a large magnitude, at any time.
  • The goal of the load shedder is to maintain the average queuing latency smaller than a given threshold τ by dropping as few tuples as possible while the stream unfolds.

3.1 Overview

  • Load-Aware Shedding (LAS) is based on a simple, yet effective, idea: if the authors assume to know the execution duration w(t) of each tuple t in the operator, then they can foresee queuing times and drop all tuples that would cause the queuing latency threshold τ to be violated.
  • The value of w(t), however, is generally unknown.
  • LAS therefore computes an estimation ŵ(t) of each execution duration and sums the estimated durations of the tuples assigned to the operator, i.e., Ĉ = ∑_{i∈[m]\D} ŵ(t_i); a minimal sketch of the resulting decision rule follows this list.
  • To enable this approach, LAS builds a sketch on the operator (i.e., a memory-efficient data structure) that tracks the execution duration of the tuples it processes.
  • This solution does not require any a priori knowledge of the stream or system, and is designed to continuously adapt to changes in the input stream or in the operator characteristics.
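A minimal Python sketch of this decision rule (our illustration, not the authors' code: the estimator callback stands in for the sketch-based ŵ(t), and the monotonic-clock bookkeeping is an assumption):

import time

class ShedderSketch:
    """Drop a tuple whenever its estimated queuing latency would
    push the running average above the threshold tau."""

    def __init__(self, tau, estimate_duration):
        self.tau = tau                     # queuing-latency threshold (seconds)
        self.estimate = estimate_duration  # stand-in for the Count Min based w-hat(t)
        self.c_hat = 0.0                   # estimated cumulated execution duration
        self.start = None                  # emission time of the first tuple
        self.sum_q = 0.0                   # sum of queuing latencies of kept tuples
        self.kept = 0                      # number of kept (non-dropped) tuples

    def shed(self, t):
        """Return True if tuple t must be dropped."""
        now = time.monotonic()
        if self.start is None:
            self.start = now
        # Estimated queuing latency: pending work minus elapsed time.
        q_hat = max(0.0, self.c_hat - (now - self.start))
        if (self.sum_q + q_hat) / (self.kept + 1) > self.tau:
            return True                    # keeping t would violate tau
        self.sum_q += q_hat
        self.kept += 1
        self.c_hat += self.estimate(t)     # account for the enqueued tuple
        return False

For example, ShedderSketch(tau=0.05, estimate_duration=lambda t: 0.001) keeps the estimated average queuing latency below 50 ms when every tuple is believed to cost 1 ms.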

3.2 LAS design

  • The operator maintains two Count Min [4] sketch matrices: the first one, denoted as F, tracks the tuple frequencies f_t; the second one, denoted as W, tracks the tuples' cumulated execution durations.
  • In the positive case the operator sends the F and W matrices to the load shedder, resets their content and moves back to the START state.
  • While in the SEND state, LS sends to O the current cumulated execution duration estimation Ĉ, piggybacking it with the first tuple t that is not dropped (Listing 3.2 lines 24-26), and moves to the RUN state.
  • It then checks if the estimated queuing latency for t satisfies the Check method (Listing 3.2 lines 19-21).

4. THEORETICAL ANALYSIS

  • Data streaming algorithms strongly rely on pseudo-random functions that map elements of the stream to uniformly distributed image values to keep the essential information of the input stream, regardless of the stream elements frequency distribution.
  • First the authors study the correctness and optimality of the shedding algorithm, under the full knowledge assumption (i.e., the shedding strategy is aware of the exact execution duration w_t for each tuple t).
  • The proofs of these equations as well as some numerical applications to illustrate the accuracy are discussed in [8].
  • Then, according to Theorem 4.1, LAS is an (ε, δ)-optimal algorithm for load shedding, as defined in Problem 2.1, over all possible data streams σ.

5. EXPERIMENTAL EVALUATION

  • In this section the authors evaluate the performance obtained by using LAS to perform load shedding.
  • The authors will first describe the general setting used to run the tests and will then discuss the results obtained through simulations and with a prototype of LAS integrated within Apache Storm.
  • Due to space constraints, the exhaustive presentation of these experiments is available in the companion paper [8].

5.1 Setup

  • In their tests the authors consider both synthetic and real datasets.
  • Conversely, an input throughput larger than 1/W will result in an under-provisioned system.
  • In order to generate 100 different streams, the authors randomize the association between the wn execution duration values and the n distinct items: for each of the wn execution duration values they pick uniformly at random n/wn different values in [n] that will be associated with that execution duration value.
  • This means that each single experiment reports the mean outcome of 5, 000 independent runs.
  • Among other information, the tweets are enriched with a field mention containing the entities mentioned in the tweet.

5.2 Simulation Results

  • In this section the authors analyze through simulations the sensitivity of LAS while varying several characteristics of the input load.
  • As expected, in this latter case all algorithms perform at the same level as load shedding is superfluous.
  • At the beginning of this phase both Straw-Man and LAS perform badly, with queuing latencies that are well above τ.
  • While the phase unfolds LAS quickly updates its data structures and converges toward the given threshold, while Straw-Man diverges as tuples continue to be enqueued on the operator worsening the bottleneck effect.

5.3 Prototype

  • The source reads from the dataset and emits the tuples consumed by bolt LS.
  • The authors assumed that this leads to a long execution duration for media (e.g., possibly caused by an access to an external DB to gather historical data), an average execution duration for politicians and a fast execution duration for others (e.g., possibly because these tweets are not decorated).
  • Figure 7 reports the average completion latency as the stream unfolds.
  • Conversely, Straw-Man completion latencies are at least one order of magnitude larger.
  • These results confirm the effectiveness of LAS in keeping a close control on queuing latencies (and thus provide more predictable performance) at the cost of dropping a fraction of the input load.

7. CONCLUSIONS

  • In this paper the authors presented LAS, a novel solution for load shedding in DSPS.
  • LAS is based on the observation that the load on operators depends both on the input rate and on the content of tuples; it leverages sketch data structures to efficiently collect at runtime information on the operator load characteristics and then uses this information to implement a load shedding policy aimed at maintaining the average queuing latencies close to a given threshold.
  • Furthermore, tests conducted both on a simulated environment and on a prototype implementation confirm that by taking into account the specific load imposed by each tuple, LAS can provide performance that closely approach a given target, while dropping a limited number of tuples.


HAL Id: hal-01413212
https://hal.inria.fr/hal-01413212
Submitted on 12 Dec 2016
Load-aware shedding in stream processing systems
Nicoló Rivetti, Yann Busnel, Leonardo Querzoni
To cite this version:
Nicoló Rivetti, Yann Busnel, Leonardo Querzoni. Load-aware shedding in stream processing systems.
10th ACM International Conference on Distributed and Event-based Systems (DEBS), Jun 2016,
Irvine, United States. pp. 61-68. DOI: 10.1145/2933267.2933311. hal-01413212.

Load-Aware Shedding in Stream Processing Systems
Nicoló Rivetti
LINA / Université de Nantes,
France
DIAG / Sapienza University of
Rome, Italy
rivetti@dis.uniroma1.it
Yann Busnel
Crest (Ensai) / Inria
Rennes, France
yann.busnel@ensai.fr
Leonardo Querzoni
DIAG / Sapienza University of
Rome, Italy
querzoni@dis.uniroma1.it
ABSTRACT
Load shedding is a technique employed by stream process-
ing systems to handle unpredictable spikes in the input load
whenever available computing resources are not adequately
provisioned. A load shedder drops tuples to keep the input
load below a critical threshold and thus avoid tuple queuing
and system thrashing. In this paper we propose Load-Aware
Shedding (LAS), a novel load shedding solution that drops
tuples with the aim of maintaining queuing times below a
tunable threshold. Tuple execution durations are estimated
at runtime using efficient sketch data structures. We pro-
vide a theoretical analysis proving that LAS is an (ε, δ)-
approximation of the optimal online load shedder and show
its performance through a practical evaluation based both
on simulations and on a running prototype.
CCS Concepts
•Software and its engineering → Distributed systems organizing principles; •Theory of computation → Online learning algorithms; Sketching and sampling;
Keywords
Stream Processing, Data Streaming, Load Shedding
1. INTRODUCTION
Distributed stream processing systems (DSPS) are today
considered a mainstream technology to build architec-
tures for the real-time analysis of big data. An application
running in a DSPS is typically modeled as a directed acyclic
graph where data operators (nodes) are interconnected by
streams of tuples containing data to be analyzed (edges).
The success of such systems can be traced back to their
ability to run complex applications at scale on clusters of
commodity hardware.
Correctly provisioning computing resources for DSPS how-
ever is far from being a trivial task. System designers need
to take into account several factors: the computational com-
plexity of the operators, the overhead induced by the frame-
work, and the characteristics of the input streams. This
latter aspect is often critical, as input data streams may
unpredictably change over time both in rate and content.
Bursty input load represents a problem for DSPS as it may
create unpredictable bottlenecks within the system that lead
to an increase in queuing latencies, pushing the system in a
state where it cannot deliver the expected quality of service
(typically expressed in terms of tuple completion latency).
Load shedding is generally considered a practical approach
to handle bursty traffic. It consists in dropping a subset of
incoming tuples as soon as a bottleneck is detected in the
system.
Existing load shedding solutions either randomly drop tu-
ples when bottlenecks are detected or apply a pre-defined
model of the application and its input that allows them to
deterministically take the best shedding decision. In any
case, all the existing solutions assume that incoming tuples
all impose the same computational load on the DSPS. How-
ever, such an assumption does not hold for many practical use
cases. The tuple execution duration, in fact, may depend on
the tuple content itself. This is often the case whenever the
receiving operator implements a logic with branches where
only a subset of the incoming tuples travels through each sin-
gle branch. If the computation associated with each branch
generates different loads, then the execution duration will
change from tuple to tuple. A tuple with a large execution
duration may delay the execution of subsequent tuples in the
same stream, thus increasing queuing latencies and possibly causing the emergence of a bottleneck.
On the basis of this simple observation, we introduce Load-
Aware Shedding (LAS), a novel solution for load shedding in
DSPS. LAS gets rid of the aforementioned assumptions and
provides efficient shedding aimed at matching given queuing
latency goals, while dropping as few tuples as possible. To
reach this goal LAS leverages a smart combination of sketch
data structures to efficiently collect at runtime information
on the time needed to compute tuples and thus build and
maintain a cost model that is then exploited to take deci-
sions on when load must be shed. LAS has been designed as
a flexible solution that can be applied on a per-operator ba-
sis, thus allowing developers to target specific critical stream
paths in their applications.
In summary, the contributions provided by this paper are
(i) the introduction of LAS, the first solution for load shed-
ding in DSPS that proactively drops tuples to avoid bottle-
necks without requiring a predefined cost model and without

any assumption on the distribution of tuples, (ii) a theoretical analysis of LAS that points out how it is an (ε, δ)-approximation of the optimal online shedding algorithm and,
finally, (iii) an experimental evaluation that illustrates how
LAS can provide predictable queuing latencies that approx-
imate a given threshold while dropping a small fraction of
the incoming tuples.
Below, the next section states the system model we con-
sider. Afterwards, Section 3 details LAS whose behavior is
then theoretically analyzed in Section 4. Section 5 reports
on our experimental evaluation and Section 6 analyzes the related work. Finally, Section 7 concludes the paper.
2. SYSTEM MODEL AND PROBLEM DEF-
INITION
We consider a distributed stream processing system (DSPS)
deployed on a cluster where several computing nodes ex-
change data through messages sent over a network. The
DSPS executes a stream processing application represented
by a topology: a directed acyclic graph interconnecting op-
erators, represented by vertices, with data streams (DS),
represented by edges.
Data injected by source operators is encapsulated in units
called tuples and each data stream is an unbounded sequence
of tuples. Without loss of generality, here we assume that
each tuple t is a finite set of key/value pairs that can be
customized to represent complex data structures. To sim-
plify the discussion, in the rest of this work we deal with
streams of unary tuples each representing a single non nega-
tive integer value. We also restrict our model to a topology
with an operator LS (load shedder) that decides which tu-
ples of its outbound DS σ consumed by operator O shall
be dropped. Tuples in σ are drawn from a large universe
[n] = {1, . . . , n} and are ordered, i.e., σ = ⟨t_1, . . . , t_m⟩. Therefore [m] = {1, . . . , m} is the index sequence associated with the m tuples contained in the stream σ. Both m and n are unknown. We denote with f_t the unknown frequency of tuple t, i.e., the number of occurrences of t in σ.
We assume that the execution duration of tuple t on oper-
ator O, denoted as w(t), depends on the content of t. In par-
ticular, without loss of generality, we consider a case where
w depends on a single, fixed and known attribute value of
t. The probability distribution of such attribute values, as
well as w, are unknown, may differ from operator to oper-
ator and may change over time. However, we assume that
subsequent changes are interleaved by a large enough time
frame such that an algorithm may have a reasonable amount
of time to adapt. On the other hand, the input throughput
of the stream may vary, even with a large magnitude, at any
time.
Let q(i) be the queuing latency of the i-th tuple of the
stream, i.e., the time spent by the i-th tuple in the inbound
buffer of operator O before being processed. Let us denote as D ⊆ [m] the set of dropped tuples in a stream of length m, i.e., dropped tuples are thus represented in D by their indices in the stream [m]. Moreover, let d ≤ m be the number of dropped tuples in a stream of length m, i.e., d = |D|. We can define the average queuing latency as:

Q(j) = (∑_{i∈[j]\D} q(i)) / (j − d)   for all j ∈ [m].
The goal of the load shedder is to maintain the average
queuing latency smaller than a given threshold τ by dropping as few tuples as possible while the stream unfolds. The quality of the shedder can be evaluated both by comparing the resulting Q against τ and by measuring the number of dropped tuples d. More formally, the load shedding problem can be defined as follows¹:
Problem 2.1 (Load Shedding). Given a data stream σ = ⟨t_1, . . . , t_m⟩, find the smallest set D such that ∀j ∈ [m] \ D, Q(j) ≤ τ.
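As a small numerical illustration (values chosen for the example, not taken from the paper): suppose the first three tuples experience queuing latencies q(1) = 1 ms, q(2) = 3 ms and q(3) = 2 ms, and the shedder drops the second tuple; then D = {2}, d = 1 and Q(3) = (q(1) + q(3))/(3 − 1) = 1.5 ms, so the constraint is satisfied for any τ ≥ 1.5 ms.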
3. LOAD-AWARE SHEDDING
3.1 Overview
Load-Aware Shedding (LAS) is based on a simple, yet
effective, idea: if we assume to know the execution duration
w(t) of each tuple t in the operator, then we can foresee
queuing times and drop all tuples that will cause the queuing
latency threshold τ to be violated. However, the value of
w(t) is generally unknown.
LAS builds and maintains at run-time a cost model for tuple execution durations. It takes shedding decisions based on the estimation Ĉ of the total execution duration of the operator: C = ∑_{i∈[m]\D} w(t_i). In order to do so, LAS computes an estimation ŵ(t) of the execution duration w(t) of each tuple t. Then, it computes the sum of the estimated execution durations of the tuples assigned to the operator, i.e., Ĉ = ∑_{i∈[m]\D} ŵ(t_i). At the arrival of the i-th tuple, subtracting from Ĉ the (physical) time elapsed from the emission of the first tuple provides LAS with an estimation q̂(i) of the queuing latency q(i) for the current tuple.
To enable this approach, LAS builds a sketch on the oper-
ator (i.e., a memory efficient data structure) that will track
the execution duration of the tuples it processes. When
a change in the stream or operator characteristics affects
the tuples' execution durations w(t), i.e., the sketch content changes, the operator will forward an updated version to the load shedder, which will then be able to (again) correctly estimate the tuples' execution durations. This solution
does not require any a priori knowledge on the stream or
system, and is designed to continuously adapt to changes in
the input stream or on the operator characteristics.
3.2 LAS design
The operator maintains two Count Min [4] sketch matrices
(Figure 1.A): the first one, denoted as F, tracks the tuple
frequencies f_t; the second one, denoted as W, tracks the tuples' cumulated execution durations W_t = w(t) × f_t. Both Count Min matrices share the same sizes and 2-universal hash functions [3]. The latter is a generalized version of the Count Min providing an (ε, δ)-additive-approximation of point queries on a stream of updates whose value is the tuple execution duration when processed by the instance. The operator
will update (Listing 3.1 lines 27-30) both matrices after each
tuple execution.
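As a concrete sizing illustration (our arithmetic, under the standard Count Min parameterization r = ⌈ln(1/δ)⌉ and c = ⌈e/ε⌉ of [4], which the paper does not state explicitly): the values in Figure 1 follow directly, since δ = 0.25 gives r = ⌈ln 4⌉ = ⌈1.39⌉ = 2 rows and ε = 0.70 gives c = ⌈e/0.70⌉ = ⌈3.88⌉ = 4 columns.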
The operator is modeled as a finite state machine (Fig-
ure 2) with two states: START and STABILIZING. The
START state lasts until the operator has executed N tuples, where N is a user defined window size parameter.
¹This is not the only possible definition of the load shedding problem. Other variants are briefly discussed in Section 6.

[Figure 1: Load-Aware Shedding design with r = 2 (δ = 0.25), c = 4 (ε = 0.70).]

The transition to the STABILIZING state (Figure 2.A)
triggers the creation of a new snapshot S. A snapshot is
a matrix of size r × c where ∀i ∈ [r], ∀j ∈ [c] : S[i, j] = W[i, j]/F[i, j] (Listing 3.1 lines 15-17). We say that the F and W matrices are stable when the relative error η between the previous snapshot and the current one is smaller than a configurable parameter µ, i.e.,

η = (∑_{i,j} |S[i, j] − W[i, j]/F[i, j]|) / (∑_{i,j} S[i, j]) ≤ µ    (1)
is satisfied. Then, each time the operator has executed N tu-
ples (Listing 3.1 lines 18-25), it checks whether Equation 1 is
satisfied. (i) In the negative case S is updated (Figure 2.B).
(ii) In the positive case the operator sends the F and W ma-
trices to the load shedder (Figure 1.B), resets their content
and moves back to the START state (Figure 2.C).
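The stability test of Equation 1 can be sketched in a few lines of Python (our illustration, assuming NumPy matrices and that at least one snapshot with S.sum() > 0 has been taken):

import numpy as np

def take_snapshot(F, W):
    # S[i, j] = W[i, j] / F[i, j] for every cell (0 where F[i, j] = 0).
    return np.divide(W, F, out=np.zeros_like(W, dtype=float), where=F != 0)

def is_stable(S, F, W, mu):
    # Relative error eta between the previous snapshot S and the
    # current per-cell averages W/F (Equation 1).
    current = take_snapshot(F, W)
    eta = np.abs(S - current).sum() / S.sum()
    return eta <= mu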
There is a delay between any change in w(t) and when LS
receives the updated F and W matrices. This introduces a
skew in the cumulated execution duration estimated by LS.
In order to compensate for this skew, we introduce a synchro-
nization mechanism that kicks in whenever the LS receives
a new pair of matrices from the operator.
The LS (Figure 1.C) maintains the estimated cumulated execution duration of the operator Ĉ and a pair of initially empty matrices ⟨F, W⟩. LS is modeled as a finite state machine (Figure 3) with three states: NOP, SEND and RUN. The LS executes the code reported in Listing 3.2. In particular, every time a new tuple t arrives at the LS, the function shed is executed. The LS starts in the NOP state where no action is performed (Listing 3.2 lines 15-17). Here we assume that in this initial phase, i.e., when the topology has just been deployed, no load shedding is required. When LS receives the first pair ⟨F, W⟩ of matrices (Figure 3.A), it moves into the SEND state and updates its local pair of matrices (Listing 3.2 lines 7-9). While in the SEND state, LS sends to O the current cumulated execution duration estimation Ĉ (Figure 1.D), piggybacking it with the first tuple t that is not dropped (Listing 3.2 lines 24-26), and moves to the RUN state (Figure 3.B). This information is used to synchronize the LS with O and remove the skew between O's cumulated execution duration C and the estimation Ĉ at LS. O replies to this request (Figure 1.E) with the difference Δ = C − Ĉ (Listing 3.1 lines 11-13). When the load shedder receives the synchronization reply (Figure 3.C) it updates its estimation Ĉ ← Ĉ + Δ (Listing 3.2 lines 11-13).
Listing 3.1: Operator
1: init do
2:   F ← 0_{r,c}                             ▷ zero matrices of size r × c
3:   W ← 0_{r,c}
4:   S ← 0_{r,c}
5:   r hash functions h_1, . . . , h_r : [n] → [c] from a 2-universal family
6:   m ← 0
7:   state ← Start
8: end init
9: function update(tuple : t, execution time : l, request : Ĉ)
10:   m ← m + 1
11:   if Ĉ not null then
12:     Δ ← C − Ĉ
13:     send ⟨Δ⟩ to LS
14:   end if
15:   if state = Start ∧ m mod N = 0 then    ▷ Figure 2.A
16:     update S
17:     state ← Stabilizing
18:   else if state = Stabilizing ∧ m mod N = 0 then
19:     if η ≤ µ (Eq. 1) then                ▷ Figure 2.C
20:       send ⟨F, W⟩ to LS
21:       state ← Start
22:       reset F and W to 0_{r,c}
23:     else                                  ▷ Figure 2.B
24:       update S
25:     end if
26:   end if
27:   for i = 1 to r do
28:     F[i, h_i(t)] ← F[i, h_i(t)] + 1
29:     W[i, h_i(t)] ← W[i, h_i(t)] + l
30:   end for
31: end function
[Figure 2: Operator finite state machine. Transitions: (A) execute N tuples / create snapshot S; (B) execute N tuples ∧ relative error η > µ / update snapshot S; (C) execute N tuples ∧ relative error η ≤ µ / send F and W to the load shedder and reset them.]
In the RUN state, the load shedder computes, for each
tuple t, the estimated queuing latency q̂(i) as the difference between the operator estimated execution duration Ĉ and
the time elapsed from the emission of the first tuple (List-
ing 3.2 line 18). It then checks if the estimated queuing
latency for t satisfies the Check method (Listing 3.2 lines
19-21).
This method encapsulates the logic for checking if a de-
sired condition on queuing latencies is violated or not. In
this paper, as stated in Section 2, we aim at maintaining the
average queuing latency below a threshold τ. Then, Check
tries to add q̂ to the current average queuing latency (Listing 3.2 line 31). If the result is larger than τ (i), it simply returns true; otherwise (ii), it updates its local value for the average queuing latency and returns false (Listing 3.2
lines 34-36). Note that different goals, based on the queuing
latency, can be defined and encapsulated within Check.
[Figure 3: Load shedder LS finite state machine (states NOP, SEND and RUN). Transitions: (A) received F and W / update local F and W; (B) synchronization request sent; (C) received reply / resynchronize Ĉ; (D) received F and W / update local F and W.]

If Check(q̂) returns true, (i) the load shedder returns true as well, i.e., tuple t must be dropped. Otherwise (ii), the operator estimated execution duration Ĉ is updated with
the estimated tuple execution duration ŵ(t), increased by a factor 1 + ε to mitigate potential under-estimations², and
the load shedder returns false (Listing 3.2 line 28), i.e.,
the tuple must not be dropped. Finally, if the load shed-
der receives a new pair hF, Wi of matrices (Figure 3.D), it
will again update its local pair of matrices and move to the
SEND state (Listing 3.2 lines 7-9).
Now we will briefly discuss the complexity of LAS.
Theorem 3.1 (Time complexity of LAS). For each tuple read from the input stream, the time complexity of LAS for the operator and the load shedder is O(log(1/δ)).

Theorem 3.2 (Space complexity of LAS). The space complexity of LAS for the operator and the load shedder is O((1/ε) log(1/δ)(log m + log n)) bits.

Theorem 3.3 (Communication complexity of LAS). The communication complexity of LAS is O(m/N) messages and O((m/N)(1/ε) log(1/δ)(log m + log n) + log m) bits.

Note that the communication cost is low with respect to the stream size since the window size N should be chosen such that N ≫ 1 (e.g., in our tests we have N = 1024).
Proofs for the previous theorems are available in [8].
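As a back-of-the-envelope check (our arithmetic): with the window size N = 1024 used in the tests, a stream of m = 10^6 tuples yields on the order of m/N ≈ 977 messages, i.e., roughly one control message per thousand tuples.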
4. THEORETICAL ANALYSIS
Data streaming algorithms strongly rely on pseudo-random
functions that map elements of the stream to uniformly dis-
tributed image values to keep the essential information of the
input stream, regardless of the stream elements frequency
distribution. Here we provide the analysis of the quality
of the shedding performed by LAS in two steps. First we
study the correctness and optimality of the shedding algo-
rithm, under full knowledge assumption (i.e., the shedding
strategy is aware of the exact execution duration w
t
for each
tuple t). For the sake of clarity, and to abide to space limits,
all the proofs are available in a companion paper [8].
We suppose that tuples cannot be preempted, i.e. they
must be processed in an uninterrupted fashion on the avail-
able operator instance. Given our system model, here we
consider the problem of minimizing d, the number of dropped
tuples, while guaranteeing that the average queuing latency
Q(t) will be upper-bounded by τ, t σ. The solution must
work online, thus the decision of enqueueing or dropping a
tuple has to be made only resorting to knowledge about tuples received so far in the stream.

²This correction factor derives from the fact that ŵ(t) is an (ε, δ)-approximation of w(t), as shown in Section 4.

Listing 3.2: Load shedder
1: init do
2:   Ĉ ← 0
3:   ⟨F, W⟩ ← ⟨0_{r,c}, 0_{r,c}⟩            ▷ zero matrices pair of size r × c
4:   same hash functions h_1, . . . , h_r as the operator
5:   state ← NOP
6: end init
7: upon ⟨F′, W′⟩ do                          ▷ Figures 3.A and 3.D
8:   state ← Send
9:   ⟨F, W⟩ ← ⟨F′, W′⟩
10: end upon
11: upon ⟨Δ⟩ do                              ▷ Figure 3.C
12:   Ĉ ← Ĉ + Δ
13: end upon
14: function shed(tuple : t)
15:   if state = NOP then
16:     return false
17:   end if
18:   q̂ ← Ĉ − elapsed time from first tuple
19:   if Check(q̂) then
20:     return true
21:   end if
22:   i ← arg min_{i∈[r]} {F[i, h_i(t)]}
23:   Ĉ ← Ĉ + (W[i, h_i(t)]/F[i, h_i(t)]) × (1 + ε)
24:   if state = Send then                   ▷ Figure 3.B
25:     piggyback Ĉ to operator on t
26:     state ← Run
27:   end if
28:   return false
29: end function
30: function Check(q)
31:   if (Q + q)/(ℓ + 1) > τ then            ▷ ℓ counts kept tuples
32:     return true
33:   end if
34:   Q ← Q + q
35:   ℓ ← ℓ + 1
36:   return false
37: end function
Let OPT be the online algorithm that provides the opti-
mal solution to Problem 2.1. We denote with D_OPT^σ (resp. d_OPT^σ) the set of dropped tuple indices (resp. the number of dropped tuples) produced by the OPT algorithm fed by stream σ (cf. Section 2). We also denote with d_LAS^σ the number of dropped tuples produced by LAS, introduced in Section 3.2, fed with the same stream σ.

Theorem 4.1 (LAS Correctness & Optimality). For any σ, we have d_LAS^σ = d_OPT^σ and ∀t ∈ σ, Q_LAS^σ(t) ≤ τ.
Now we remove the previous assumptions and analyze the
approximation made on execution duration w(t) for each
tuple t. LAS uses two matrices, F and W, to estimate the
execution time w(t) of each tuple submitted to the operator.
By the Count Min sketch algorithm [4] and Listing 3.1, we
have that for any t ∈ [n] and for each row i ∈ [r],

F[i][h_i(t)] = f_t + ∑_{u=1, u≠t}^{n} f_u · 1_{h_i(u)=h_i(t)}

and

W[i][h_i(t)] = f_t · w(t) + ∑_{u=1, u≠t}^{n} f_u · w(u) · 1_{h_i(u)=h_i(t)},

where 1_{·} denotes the indicator function.
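Putting the two matrices together, the point estimate ŵ(t) used by the load shedder (Listing 3.2, lines 22-23) can be sketched in self-contained Python as follows; this is our illustration, not the authors' code, and the hash family shown is a simplified stand-in for the 2-universal construction of [3]:

import random

class DurationSketch:
    """Count Min style tracking of per-tuple execution durations:
    F counts occurrences, W accumulates durations, and the estimate
    w_hat(t) = W[i][h_i(t)] / F[i][h_i(t)] uses the row i where t
    collided least, as in Listing 3.2 (lines 22-23)."""

    PRIME = 2**31 - 1  # Mersenne prime for the hash family

    def __init__(self, r, c, seed=42):
        rng = random.Random(seed)
        # h_i(x) = ((a*x + b) mod p) mod c, drawn independently per row.
        self.params = [(rng.randrange(1, self.PRIME), rng.randrange(self.PRIME))
                       for _ in range(r)]
        self.c = c
        self.F = [[0] * c for _ in range(r)]
        self.W = [[0.0] * c for _ in range(r)]

    def _h(self, i, t):
        a, b = self.params[i]
        return ((a * t + b) % self.PRIME) % self.c

    def update(self, t, duration):
        # Executed by the operator after processing tuple t (Listing 3.1).
        for i in range(len(self.params)):
            j = self._h(i, t)
            self.F[i][j] += 1
            self.W[i][j] += duration

    def estimate(self, t):
        # Pick the row where t suffered the fewest collisions.
        i = min(range(len(self.params)), key=lambda k: self.F[k][self._h(k, t)])
        j = self._h(i, t)
        return self.W[i][j] / self.F[i][j] if self.F[i][j] else 0.0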

Citations
01 Dec 2004
TL;DR: In this paper, the authors introduce a sublinear space data structure called the countmin sketch for summarizing data streams, which allows fundamental queries in data stream summarization such as point, range, and inner product queries to be approximately answered very quickly; in addition it can be applied to solve several important problems in data streams such as finding quantiles, frequent items, etc.
Abstract: We introduce a new sublinear space data structure--the count-min sketch--for summarizing data streams. Our sketch allows fundamental queries in data stream summarization such as point, range, and inner product queries to be approximately answered very quickly; in addition, it can be applied to solve several important problems in data streams such as finding quantiles, frequent items, etc. The time and space bounds we show for using the CM sketch to solve these problems significantly improve those previously known--typically from 1/e2 to 1/e in factor.

65 citations

Proceedings ArticleDOI
20 Apr 2020
TL;DR: This work introduces a hybrid model that combines both input-based and statebased shedding to achieve high result quality under constrained resources and indicates that such hybrid shedding improves the recall by up to 14× for synthetic data and 11.4× for real-world data, compared to baseline approaches.
Abstract: Complex event processing (CEP) systems that evaluate queries over streams of events may face unpredictable input rates and query selectivities. During short peak times, exhaustive processing is then no longer reasonable, or even infeasible, and systems shall resort to best-effort query evaluation and strive for optimal result quality while staying within a latency bound. In traditional data stream processing, this is achieved by load shedding that discards some stream elements without processing them based on their estimated utility for the query result.We argue that such input-based load shedding is not always suitable for CEP queries. It assumes that the utility of each individual element of a stream can be assessed in isolation. For CEP queries, however, this utility may be highly dynamic: Depending on the presence of partial matches, the impact of discarding a single event can vary drastically. In this work, we therefore complement input-based load shedding with a statebased technique that discards partial matches. We introduce a hybrid model that combines both input-based and statebased shedding to achieve high result quality under constrained resources. Our experiments indicate that such hybrid shedding improves the recall by up to 14× for synthetic data and 11.4× for real-world data, compared to baseline approaches.

20 citations


Cites background from "Load-aware shedding in stream proce..."

  • ...Random shedding strategies have been implemented in current streaming systems, such as Heron [19], and Kafka [7], while shedding may also be guided by queueing latencies [36], concept drift detection [26], and the expected quality of service [33]....

    [...]

Proceedings ArticleDOI
26 Nov 2018
TL;DR: SpinStreams is proposed, a static optimization tool able to leverage cost models that programmers can use to detect and understand the inefficiencies of an initial application design and suggests optimizations for restructuring applications by generating code to be run on the SPS.
Abstract: The ubiquity of data streams in different fields of computing has led to the emergence of Stream Processing Systems (SPSs) used to program applications that extract insights from unbounded sequences of data items. Streaming applications demand various kinds of optimizations. Most of them are aimed at increasing throughput and reducing processing latency, and need cost models used to analyze the steady-state performance by capturing complex aspects like backpressure and bottleneck detection. In those systems, the tendency is to support dynamic optimizations of running applications which, although with a substantial run-time overhead, are unavoidable in case of unpredictable workloads. As an orthogonal direction, this paper proposes SpinStreams, a static optimization tool able to leverage cost models that programmers can use to detect and understand the inefficiencies of an initial application design. SpinStreams suggests optimizations for restructuring applications by generating code to be run on the SPS. We present the theory behind our optimizations, which cover more general classes of application structures than the ones studied in the literature so far. Then, we assess the accuracy of our models in Akka, an actor-based streaming framework providing a Java and Scala API.

14 citations


Cites background from "Load-aware shedding in stream proce..."

  • ...A common alternative is to apply load shedding [33] to prevent the streaming buffers to indefinitely grow by discarding input items....

    [...]

Proceedings ArticleDOI
09 Dec 2019
TL;DR: A load shedding framework called eSPICE is proposed for complex event processing systems that depends on building a probabilistic model that learns about the importance of events in a window and its type are used as features to build the model.
Abstract: Complex event processing systems process the input event streams on-the-fly. Since input event rate could overshoot the system's capabilities and results in violating a defined latency bound, load shedding is used to drop a portion of the input event streams. The crucial question here is how many and which events to drop so the defined latency bound is maintained and the degradation in the quality of results is minimized. In stream processing domain, different load shedding strategies have been proposed but they mainly depend on the importance of individual tuples (events). However, as complex event processing systems perform pattern detection, the importance of events is also influenced by other events in the same pattern. In this paper, we propose a load shedding framework called eSPICE for complex event processing systems. eSPICE depends on building a probabilistic model that learns about the importance of events in a window. The position of an event in a window and its type are used as features to build the model. Further, we provide algorithms to decide when to start dropping events and how many events to drop. Moreover, we extensively evaluate the performance of eSPICE on two real-world datasets.

11 citations


Cites background from "Load-aware shedding in stream proce..."

  • ...Load shedding has been proposed by several research groups [13, 14, 21, 23, 29, 30] in the stream processing domain....

    [...]

References
Journal ArticleDOI
J. Lawrence Carter, Mark N. Wegman
TL;DR: An input independent average linear time algorithm for storage and retrieval on keys that makes a random choice of hash function from a suitable class of hash functions.

2,886 citations


"Load-aware shedding in stream proce..." refers methods in this paper

  • ...[3] J. L. Carter and M. N. Wegman....

    [...]

  • ...Carter and Wegman [3] provide an efficient method to build large families of hash functions approximating the 2-universality property....

    [...]

Journal ArticleDOI
TL;DR: In this paper, the authors introduce a sublinear space data structure called the countmin sketch for summarizing data streams, which allows fundamental queries in data stream summarization such as point, range, and inner product queries to be approximately answered very quickly; in addition it can be applied to solve several important problems in data streams such as finding quantiles, frequent items, etc.

1,939 citations

Journal ArticleDOI
TL;DR: Data Streams: Algorithms and Applications surveys the emerging area of algorithms for processing data streams and associated applications, which rely on metric embeddings, pseudo-random computations, sparse approximation theory and communication complexity.
Abstract: In the data stream scenario, input arrives very rapidly and there is limited memory to store the input. Algorithms have to work with one or few passes over the data, space less than linear in the input size or time significantly less than the input size. In the past few years, a new theory has emerged for reasoning about algorithms that work within these constraints on space, time, and number of passes. Some of the methods rely on metric embeddings, pseudo-random computations, sparse approximation theory and communication complexity. The applications for this scenario include IP network traffic analysis, mining text message streams and processing massive data sets in general. Researchers in Theoretical Computer Science, Databases, IP Networking and Computer Systems are working on the data stream challenges. This article is an overview and survey of data stream algorithmics and is an updated version of [1].

1,598 citations

Journal ArticleDOI
01 Aug 2003
TL;DR: The basic processing model and architecture of Aurora, a new system to manage data streams for monitoring applications, are described, along with a stream-oriented set of operators.
Abstract: .This paper describes the basic processing model and architecture of Aurora, a new system to manage data streams for monitoring applications. Monitoring applications differ substantially from conventional business data processing. The fact that a software system must process and react to continual inputs from many sources (e.g., sensors) rather than from human operators requires one to rethink the fundamental architecture of a DBMS for this application area. In this paper, we present Aurora, a new DBMS currently under construction at Brandeis University, Brown University, and M.I.T. We first provide an overview of the basic Aurora model and architecture and then describe in detail a stream-oriented set of operators.

1,518 citations

Book ChapterDOI
09 Sep 2003
TL;DR: This paper examines a technique for dynamically inserting and removing drop operators into query plans as required by the current load, and addresses the problems of determining when load shedding is needed, where in the query plan to insert drops, and how much of the load should be shed at that point in the plan.
Abstract: A Data Stream Manager accepts push-based inputs from a set of data sources, processes these inputs with respect to a set of standing queries, and produces outputs based on Quality-of-Service (QoS) specifications. When input rates exceed system capacity, the system will become overloaded and latency will deteriorate. Under these conditions, the system will shed load, thus degrading the answer, in order to improve the observed latency of the results. This paper examines a technique for dynamically inserting and removing drop operators into query plans as required by the current load. We examine two types of drops: the first drops a fraction of the tuples in a randomized fashion, and the second drops tuples based on the importance of their content. We address the problems of determining when load shedding is needed, where in the query plan to insert drops, and how much of the load should be shed at that point in the plan. We describe efficient solutions and present experimental evidence that they can bring the system back into the useful operating range with minimal degradation in answer quality.

662 citations


"Load-aware shedding in stream proce..." refers background in this paper

  • ...first introduced in [11] the idea of semantic load shedding....

    [...]

Frequently Asked Questions (9)
Q1. What have the authors contributed in "Load-aware shedding in stream processing systems"?

In this paper the authors propose Load-Aware Shedding (LAS), a novel load shedding solution that drops tuples with the aim of maintaining queuing times below a tunable threshold. The authors provide a theoretical analysis proving that LAS is an (ε, δ)-approximation of the optimal online load shedder and show its performance through a practical evaluation based both on simulations and on a running prototype.

By finely tuning the parameter r to log(1/δ), under the assumption of [8], the authors are then able to (ε, δ)-approximate w(t) for any t ∈ [n]. 

Aurora [1] is the first stream processing system where shedding has been proposed as a technique to deal with bursty input traffic. 

Bursty input load represents a problem for DSPS as it may create unpredictable bottlenecks within the system that lead to an increase in queuing latencies, pushing the system in a state where it cannot deliver the expected quality of service (typically expressed in terms of tuple completion latency). 

At the arrival of the i-th tuple, subtracting from Ĉ the (physical) time elapsed from the emission of the first tuple provides LAS with an estimation q̂(i) of the queuing latency q(i) for the current tuple. 

The transition to phase F is extremely abrupt as the input throughput is brought back to the equivalent of 0% of under-provisioning, but the cost to handle each tuple on the operator is doubled. 

The transition to phase C brings the system back to the initial configuration, while in phase D the change in the tuple frequency distribution is managed very differently by each solution: both Full Knowledge and LAS compensate this change by starting to drop more tuples, while still maintaining the average queuing latency close to the desired threshold τ.

This result stems from the fact that the sketch data structures used to trace tuple execution durations perform at their best on strongly skewed distributions, rather than on uniform ones.

The authors also restrict their model to a topology with an operator LS (load shedder) that decides which tuples of its outbound DS σ consumed by operator O shall be dropped.