Load-aware shedding in stream processing systems
Summary (3 min read)
1. INTRODUCTION
- Distributed stream processing systems (DSPS) are today considered a mainstream technology for building architectures for the real-time analysis of big data.
- This latter aspect is often critical, as input data streams may unpredictably change over time both in rate and content.
- Existing load shedding solutions either randomly drop tuples when bottlenecks are detected or apply a pre-defined model of the application and its input that allows them to deterministically take the best shedding decision.
- The tuple execution duration, in fact, may depend on the tuple content itself.
- Afterwards, Section 3 details LAS whose behavior is then theoretically analyzed in Section 4.
2. SYSTEM MODEL AND PROBLEM DEFINITION
- The authors consider a distributed stream processing system (DSPS) deployed on a cluster where several computing nodes exchange data through messages sent over a network.
- Data injected by source operators is encapsulated in units called tuples and each data stream is an unbounded sequence of tuples.
- Without loss of generality, here the authors assume that each tuple t is a finite set of key/value pairs that can be customized to represent complex data structures.
- On the other hand, the input throughput of the stream may vary, even with a large magnitude, at any time.
- The goal of the load shedder is to maintain the average queuing latency smaller than a given threshold τ by dropping as few tuples as possible while the stream unfolds.
3.1 Overview
- Load-Aware Shedding (LAS) is based on a simple yet effective idea: if the execution duration w(t) of each tuple t in the operator is known, then queuing times can be foreseen and all tuples that would cause the queuing latency threshold τ to be violated can be dropped.
- The value of w(t) is generally unknown.
- Then, it computes the sum of the estimated execution durations of the tuples assigned to the operator, i.e., Ĉ = ∑_{i ∈ [m]\D} ŵ(t_i).
- To enable this approach, LAS builds a sketch on the operator (i.e., a memory efficient data structure) that will track the execution duration of the tuples it processes.
- This solution does not require any a priori knowledge on the stream or system, and is designed to continuously adapt to changes in the input stream or on the operator characteristics.
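The shedding decision sketched above can be illustrated with a few lines of Python. This is a minimal sketch of the idea only, assuming the estimator `estimate_duration` (playing the role of ŵ(t)) is supplied by the caller; in LAS it would be backed by the sketch data structures, and all names here are illustrative, not the authors' code.

```python
import time

def make_shedder(tau, estimate_duration):
    """Decide whether to drop each incoming tuple so that the
    predicted queuing latency stays below the threshold tau.
    `estimate_duration(t)` plays the role of w-hat(t)."""
    state = {"busy_until": 0.0}  # predicted time at which the operator frees up

    def offer(t, now=None):
        now = time.monotonic() if now is None else now
        start = max(state["busy_until"], now)
        queuing_latency = start - now          # predicted wait before t runs
        if queuing_latency > tau:
            return False                       # drop t: threshold would be violated
        state["busy_until"] = start + estimate_duration(t)
        return True                            # forward t to the operator
    return offer
```

With τ = 1.0 and a constant estimated duration of 0.6, the third tuple offered at the same instant is dropped, since its predicted wait (1.2) exceeds the threshold.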
3.2 LAS design
- The operator maintains two Count-Min [4] sketch matrices: the first one, denoted F, tracks the tuple frequencies f_t; the second one, denoted W, tracks the tuples' cumulated execution durations.
- In the positive case, the operator sends the F and W matrices to the load shedder, resets their content, and moves back to the START state.
- While in the SEND state, LS sends to O the current cumulated execution duration estimation Ĉ, piggybacking it with the first tuple t that is not dropped (Listing 3.2, lines 24-26), and moves into the RUN state.
- It then checks if the estimated queuing latency for t satisfies the Check method (Listing 3.2 lines 19-21).
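The two-matrix design can be sketched as follows. This is an illustrative reconstruction, not the authors' implementation: it assumes ŵ(t) is obtained as the ratio of the two Count-Min point queries (cumulated duration over frequency), and the hash functions are simple salted hashes rather than a proper 2-universal family.

```python
import random

class CountMin:
    """Minimal Count-Min sketch: d rows of w counters, each row
    indexed by a differently salted hash (illustrative only)."""
    def __init__(self, w, d, seed=42):
        rng = random.Random(seed)
        self.w = w
        self.salts = [rng.getrandbits(64) for _ in range(d)]
        self.rows = [[0.0] * w for _ in range(d)]

    def _cells(self, key):
        for row, salt in zip(self.rows, self.salts):
            yield row, hash((salt, key)) % self.w

    def add(self, key, value=1.0):
        for row, j in self._cells(key):
            row[j] += value

    def query(self, key):
        # min over rows gives an upper-bound estimate of the true sum
        return min(row[j] for row, j in self._cells(key))

class DurationTracker:
    """F tracks tuple frequencies; W tracks cumulated execution
    durations. w-hat(t) is taken as their ratio (a sketch of the
    LAS idea, not the paper's exact code)."""
    def __init__(self, w=54, d=3):
        self.F, self.W = CountMin(w, d), CountMin(w, d)

    def observe(self, t, duration):
        self.F.add(t, 1.0)
        self.W.add(t, duration)

    def estimate(self, t):
        f = self.F.query(t)
        return self.W.query(t) / f if f > 0 else 0.0
```

For example, after observing the same tuple key with durations 2.0 and 4.0, the estimated per-tuple duration is 3.0.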
4. THEORETICAL ANALYSIS
- Data streaming algorithms strongly rely on pseudo-random functions that map elements of the stream to uniformly distributed image values to keep the essential information of the input stream, regardless of the stream elements frequency distribution.
- First the authors study the correctness and optimality of the shedding algorithm, under full knowledge assumption (i.e., the shedding strategy is aware of the exact execution duration wt for each tuple t).
- The proofs of these equations as well as some numerical applications to illustrate the accuracy are discussed in [8].
- Then, according to Theorem 4.1, LAS is an (ε, δ)-optimal algorithm for load shedding, as defined in Problem 2.1, over all possible data streams σ.
5. EXPERIMENTAL EVALUATION
- In this section the authors evaluate the performance obtained by using LAS to perform load shedding.
- The authors will first describe the general setting used to run the tests and will then discuss the results obtained through simulations and with a prototype of LAS integrated within Apache Storm.
- Due to space constraints, the exhaustive presentation of these experiments is available in the companion paper [8].
5.1 Setup
- In their tests the authors consider both synthetic and real datasets.
- Conversely, an input throughput larger than 1/W will result in an under-provisioned system.
- In order to generate 100 different streams, the authors randomize the association between the wn execution duration values and the n distinct items: for each of the wn execution duration values they pick uniformly at random n/wn different values in [n] that will be associated to that execution duration value.
- This means that each single experiment reports the mean outcome of 5, 000 independent runs.
- Among other information, the tweets are enriched with a field mention containing the entities mentioned in the tweet.
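The randomized association between execution duration values and items described above can be sketched as follows. This is a hedged reconstruction of the synthetic-dataset construction (function name and signature are illustrative): the n distinct items are partitioned uniformly at random into equally sized groups, one per execution-duration value.

```python
import random

def assign_durations(n, durations, rng=None):
    """Randomly partition the n distinct items into len(durations)
    equally sized groups, associating each group with one
    execution-duration value."""
    rng = rng or random.Random()
    assert n % len(durations) == 0, "n must split evenly across duration values"
    items = list(range(n))
    rng.shuffle(items)                 # random association, as in the paper's setup
    group = n // len(durations)
    mapping = {}
    for k, w in enumerate(durations):
        for item in items[k * group:(k + 1) * group]:
            mapping[item] = w
    return mapping
```

Re-running this with 100 different seeds yields the 100 different streams used in the tests.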
5.2 Simulation Results
- In this section the authors analyze through simulations the sensitivity of LAS while varying several characteristics of the input load.
- As expected, in this latter case all algorithms perform at the same level as load shedding is superfluous.
- At the beginning of this phase both Straw-Man and LAS perform poorly, with queuing latencies largely above τ.
- As the phase unfolds, LAS quickly updates its data structures and converges toward the given threshold, while Straw-Man diverges as tuples continue to be enqueued on the operator, worsening the bottleneck effect.
5.3 Prototype
- The source reads from the dataset and emits the tuples consumed by bolt LS.
- The authors assumed that this leads to a long execution duration for media (e.g., possibly caused by an access to an external DB to gather historical data), an average execution duration for politicians and a fast execution duration for others (e.g., possibly because these tweets are not decorated).
- Figure 7 reports the average completion latency as the stream unfolds.
- Conversely, StrawMan completion latencies are at least one order of magnitude larger.
- These results confirm the effectiveness of LAS in keeping a close control on queuing latencies (and thus provide more predictable performance) at the cost of dropping a fraction of the input load.
7. CONCLUSIONS
- The authors presented LAS, a novel solution for load shedding in DSPS.
- LAS is based on the observation that the load on operators depends both on the input rate and on the content of tuples. It leverages sketch data structures to efficiently collect, at runtime, information on the operator load characteristics, and then uses this information to implement a load shedding policy aimed at maintaining the average queuing latencies close to a given threshold.
- Furthermore, tests conducted both in a simulated environment and on a prototype implementation confirm that, by taking into account the specific load imposed by each tuple, LAS can provide performance that closely approaches a given target, while dropping a limited number of tuples.
Frequently Asked Questions (9)
Q2. How do the authors fine-tune the parameter r to log(1/δ)?
By finely tuning the parameter r to log(1/δ), under the assumption of [8], the authors are then able to (ε, δ)-approximate w(t) for any t ∈ [n].
Q3. What is the first stream processing system where shedding has been proposed?
Aurora [1] is the first stream processing system where shedding has been proposed as a technique to deal with bursty input traffic.
Q4. What is the definition of bursty input load?
Bursty input load represents a problem for DSPS as it may create unpredictable bottlenecks within the system that lead to an increase in queuing latencies, pushing the system in a state where it cannot deliver the expected quality of service (typically expressed in terms of tuple completion latency).
Q5. What is the LAS cost model for shedding tuples?
At the arrival of the i-th tuple, subtracting from Ĉ the (physical) time elapsed from the emission of the first tuple provides LAS with an estimation q̂(i) of the queuing latency q(i) for the current tuple.
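That estimation is a one-line computation; the helper below writes it out explicitly (the clamp at zero is an assumption here, since physical time elapsed can exceed the estimated cumulated duration and a queuing latency cannot be negative).

```python
def estimate_queuing_latency(C_hat, first_emit_time, now):
    """q-hat(i): the estimated cumulated execution duration C-hat of the
    tuples already forwarded, minus the physical time elapsed since the
    first of them was emitted, clamped at zero."""
    return max(0.0, C_hat - (now - first_emit_time))
```

For example, with Ĉ = 5.0 seconds of estimated pending work and 3.0 seconds elapsed, the current tuple is predicted to wait 2.0 seconds.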
Q6. What is the transition to phase F?
The transition to phase F is extremely abrupt as the input throughput is brought back to the equivalent of 0% of under-provisioning, but the cost to handle each tuple on the operator is doubled.
Q7. What is the transition to phase C?
The transition to phase C brings the system back in the initial configuration, while in phase D the change in the tuple frequency distribution is managed very differently by each solution: both Full Knowledge and LAS compensate this change by starting to drop more tuples, but still maintaining the average queuing latency close to the desired threshold τ .
Q8. What is the result of the sketch data structures used to trace tuples?
This result stems from the fact that the sketch data structures used to trace tuple execution durations perform at their best on strongly skewed distributions, rather than on uniform ones.
Q9. What is the simplest way to restrict a tuple to a topology?
The authors restrict their model to a topology with an operator LS (load shedder) that decides which tuples of its outbound data stream σ, consumed by operator O, shall be dropped.