Proceedings Article

On Load Shedding in Complex Event Processing

TL;DR: This paper formalizes broad classes of CEP load-shedding scenarios as different optimization problems, demonstrates an array of complexity results that reveal the hardness of these problems, and constructs shedding algorithms with performance guarantees.
Abstract: Complex Event Processing (CEP) is a stream processing model that focuses on detecting event patterns in continuous event streams. While the CEP model has gained popularity in the research communities and commercial technologies, the problem of gracefully degrading performance under heavy load in the presence of resource constraints, or load shedding, has been largely overlooked. CEP is similar to “classical” stream data management, but addresses a substantially different class of queries. This unfortunately renders the load shedding algorithms developed for stream data processing inapplicable. In this paper we study CEP load shedding under various resource constraints. We formalize broad classes of CEP load-shedding scenarios as different optimization problems. We demonstrate an array of complexity results that reveal the hardness of these problems and construct shedding algorithms with performance guarantees. Our results shed some light on the difficulty of developing load-shedding algorithms that maximize utility.
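The memory-constrained flavor of the problem can be pictured with a toy brute-force formulation (a hypothetical sketch only; the event types, costs, and utilities below are invented, and the paper's actual algorithms and hardness results are far more involved): choose which event types to retain under a memory budget so that the total utility of fully served queries is maximized.

```python
from itertools import combinations

def best_retained_types(type_cost, queries, budget):
    """Brute-force, utility-maximal shedding over event types.

    type_cost: dict of event type -> memory cost of retaining it
    queries:   list of (required_types, utility) pairs; a query yields
               its utility only if *all* its event types are retained
    budget:    total memory available
    """
    types = list(type_cost)
    best_utility, best_keep = 0, frozenset()
    for r in range(len(types) + 1):
        for keep in combinations(types, r):
            keep = frozenset(keep)
            if sum(type_cost[t] for t in keep) > budget:
                continue  # violates the memory constraint
            utility = sum(u for req, u in queries if req <= keep)
            if utility > best_utility:
                best_utility, best_keep = utility, keep
    return best_utility, best_keep

# Retaining {A, C, E} (cost 3) serves the first two queries (utility 9)
# but sheds B and D, sacrificing the third query.
cost = {"A": 1, "B": 3, "C": 1, "D": 3, "E": 1}
qs = [(frozenset("AC"), 5), (frozenset("CE"), 4), (frozenset("BD"), 2)]
print(best_retained_types(cost, qs, budget=3))
```

Even this toy version is a combinatorial subset-selection problem, which hints at why maximizing utility under resource constraints is hard.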


Citations
Journal ArticleDOI
TL;DR: The main techniques and state-of-the-art research efforts in IoT from data-centric perspectives are reviewed, including data stream processing, data storage models, complex event processing, and searching in IoT.

289 citations

Posted Content
TL;DR: The main techniques and state-of-the-art research efforts in IoT from data-centric perspectives are surveyed, including data stream processing, data storage models, complex event processing, and searching in IoT.
Abstract: With the recent advances in radio-frequency identification (RFID), low-cost wireless sensor devices, and Web technologies, the Internet of Things (IoT) approach has gained momentum in connecting everyday objects to the Internet and facilitating machine-to-human and machine-to-machine communication with the physical world. While IoT offers the capability to connect and integrate both digital and physical entities, enabling a whole new class of applications and services, several significant challenges need to be addressed before these applications and services can be fully realized. A fundamental challenge centers around managing IoT data, typically produced in dynamic and volatile environments, which is not only extremely large in scale and volume, but also noisy, and continuous. This article surveys the main techniques and state-of-the-art research efforts in IoT from data-centric perspectives, including data stream processing, data storage models, complex event processing, and searching in IoT. Open research issues for IoT data management are also discussed.

43 citations


Cites background from "On Load Shedding in Complex Event P..."

  • ...For example, Heinze et al. [2013] study complex event processing in a distributed environment and propose FUGU – an elastic allocator for Complex Event Processing systems....


  • ...Very recently, He et al. [2014] investigate load shedding techniques for complex event processing under various resource constraints....


Proceedings ArticleDOI
13 Jun 2016
TL;DR: This paper provides a theoretical analysis proving that LAS is an (ε, δ)-approximation of the optimal online load shedder and shows its performance through a practical evaluation based both on simulations and on a running prototype.
Abstract: Load shedding is a technique employed by stream processing systems to handle unpredictable spikes in the input load whenever available computing resources are not adequately provisioned. A load shedder drops tuples to keep the input load below a critical threshold and thus avoid tuple queuing and system thrashing. In this paper we propose Load-Aware Shedding (LAS), a novel load shedding solution that drops tuples with the aim of maintaining queuing times below a tunable threshold. Tuple execution durations are estimated at runtime using efficient sketch data structures. We provide a theoretical analysis proving that LAS is an (ε, δ)-approximation of the optimal online load shedder and show its performance through a practical evaluation based both on simulations and on a running prototype.
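The mechanism described above can be approximated in a few lines (a simplified sketch: a per-type running mean stands in for the sketch data structures LAS actually uses, and LAS's approximation guarantee does not carry over to this toy):

```python
class LoadAwareShedder:
    """Simplified load-aware shedder: admit a tuple only if the
    estimated queuing time stays below a tunable threshold."""

    def __init__(self, threshold):
        self.threshold = threshold  # max tolerated queuing time (seconds)
        self.est = {}               # tuple type -> mean execution duration
        self.count = {}
        self.queued = 0.0           # estimated work currently queued

    def observe(self, ttype, duration):
        """Refine the duration estimate after a tuple finishes."""
        n = self.count.get(ttype, 0) + 1
        self.count[ttype] = n
        prev = self.est.get(ttype, 0.0)
        self.est[ttype] = prev + (duration - prev) / n
        self.queued = max(0.0, self.queued - duration)

    def admit(self, ttype):
        """True = process the tuple, False = shed it."""
        d = self.est.get(ttype, 0.0)
        if self.queued + d > self.threshold:
            return False  # dropping keeps queuing time bounded
        self.queued += d
        return True
```

Admitting a tuple charges its estimated duration to the queue and completions credit it back, so the shed rate automatically tracks the offered load.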

31 citations


Cites background from "On Load Shedding in Complex Event P..."

  • ...in [5] specialized the problem to the case of complex event processing....


Journal ArticleDOI
01 Jan 2020
TL;DR: This paper reviews core components that enable large-scale querying and indexing for microblogs data, and discusses system-level issues and on-going effort on supporting microblogs through the rising wave of big data systems.
Abstract: Microblogs data is the micro-length user-generated data that is posted on the web, e.g., tweets, online reviews, comments on news and social media. It has gained considerable attention in recent years due to its widespread popularity, rich content, and value in several societal applications. Nowadays, microblogs applications span a wide spectrum of interests including targeted advertising, market reports, news delivery, political campaigns, rescue services, and public health. Consequently, major research efforts have been spent to manage, analyze, and visualize microblogs to support different applications. This paper gives a comprehensive review of major research and system work in microblogs data management. The paper reviews core components that enable large-scale querying and indexing for microblogs data. A dedicated part gives particular focus for discussing system-level issues and on-going effort on supporting microblogs through the rising wave of big data systems. In addition, we review the major research topics that exploit these core data management components to provide innovative and effective analysis and visualization for microblogs, such as event detection, recommendations, automatic geotagging, and user queries. Throughout the different parts, we highlight the challenges, innovations, and future opportunities in microblogs data research.

23 citations


Cites background from "On Load Shedding in Complex Event P..."

  • ...ment in database systems [97], anti-caching in main-memory databases [85,197,374], and load shedding in data stream management systems [33,112,138], flushing in microblogs...


Proceedings ArticleDOI
20 Apr 2020
TL;DR: This work introduces a hybrid model that combines both input-based and state-based shedding to achieve high result quality under constrained resources and indicates that such hybrid shedding improves the recall by up to 14× for synthetic data and 11.4× for real-world data, compared to baseline approaches.
Abstract: Complex event processing (CEP) systems that evaluate queries over streams of events may face unpredictable input rates and query selectivities. During short peak times, exhaustive processing is then no longer reasonable, or even infeasible, and systems shall resort to best-effort query evaluation and strive for optimal result quality while staying within a latency bound. In traditional data stream processing, this is achieved by load shedding that discards some stream elements without processing them based on their estimated utility for the query result. We argue that such input-based load shedding is not always suitable for CEP queries. It assumes that the utility of each individual element of a stream can be assessed in isolation. For CEP queries, however, this utility may be highly dynamic: Depending on the presence of partial matches, the impact of discarding a single event can vary drastically. In this work, we therefore complement input-based load shedding with a state-based technique that discards partial matches. We introduce a hybrid model that combines both input-based and state-based shedding to achieve high result quality under constrained resources. Our experiments indicate that such hybrid shedding improves the recall by up to 14× for synthetic data and 11.4× for real-world data, compared to baseline approaches.
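The hybrid idea can be sketched as follows (illustrative only: the utility functions, budgets, and `progress` field are invented for the example, and the paper's actual shedding decisions are considerably more sophisticated):

```python
def hybrid_shed(events, matches, event_utility, match_utility,
                event_budget, match_budget):
    """Keep the highest-utility input events (input-based shedding)
    and the highest-utility partial matches (state-based shedding),
    dropping everything over budget."""
    kept_events = sorted(events, key=event_utility, reverse=True)[:event_budget]
    kept_matches = sorted(matches, key=match_utility, reverse=True)[:match_budget]
    return kept_events, kept_matches

# Events carry a static per-type utility; a partial match is scored by
# its progress toward completion (closer to a full match = more valuable).
events = [("A", 0.9), ("B", 0.2), ("C", 0.7)]
matches = [{"progress": 1}, {"progress": 3}, {"progress": 2}]
kept_e, kept_m = hybrid_shed(events, matches,
                             lambda e: e[1], lambda m: m["progress"],
                             event_budget=2, match_budget=1)
```

Scoring partial matches by progress captures the dynamic-utility point of the abstract: discarding an almost-complete match costs far more recall than discarding a fresh one.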

20 citations


Cites background or methods from "On Load Shedding in Complex Event P..."

  • ...The characteristics and the complexity of load shedding for CEP has been discussed in [24]....


  • ...This is infeasible for CEP [24], due to the high volatility of query selectivity and, therefore, processing rates of a system....


  • ...The aforementioned techniques are not applicable for CEP, though [24], as we discuss based on the questions of when to shed (Q1); what to shed (Q2); and how much to shed (Q3)....


  • ...Against this background, CEP systems shall employ best-effort processing, when resource demands peak [24]....


References
Journal ArticleDOI
01 Jun 2011
TL;DR: iCBS takes the query costs derived from the service level agreements between the service provider and its customers into account to make cost-aware scheduling decisions, and reduces the online time complexity from O(N) for the original version CBS to O(log² N) for iCBS.
Abstract: In a cloud computing environment, it is beneficial for the cloud service provider to offer differentiated services among different customers, who often have different cost profiles. Therefore, cost-aware scheduling of queries is important. A practical cost-aware scheduling algorithm must be able to handle the highly demanding query volumes in the scheduling queues to make online scheduling decisions very quickly. We develop such a highly efficient cost-aware query scheduling algorithm, called iCBS. iCBS takes the query costs derived from the service level agreements (SLAs) between the service provider and its customers into account to make cost-aware scheduling decisions. iCBS is an incremental variation of an existing scheduling algorithm, CBS. Although CBS exhibits an exceptionally good cost performance, it has a prohibitive time complexity. Our main contributions are (1) to observe how CBS behaves under piecewise linear SLAs, which are very common in cloud computing systems, and (2) to efficiently leverage these observations and to reduce the online time complexity from O(N) for the original version CBS to O(log² N) for iCBS.
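A minimal way to picture cost-aware scheduling under piecewise linear SLAs (a greedy stand-in written for illustration; the segment encoding and tier names are invented, and this is not the CBS/iCBS algorithm itself):

```python
def sla_cost(sla, response_time):
    """Penalty of a piecewise linear SLA, given as (start, base, slope)
    segments sorted by start time; cost grows linearly within a segment."""
    for start, base, slope in reversed(sla):
        if response_time >= start:
            return base + slope * (response_time - start)
    return 0.0

def pick_next(queries, now):
    """Run the waiting query whose penalty is currently growing fastest.
    queries: list of (name, arrival_time, sla) triples."""
    def current_slope(query):
        _, arrival, sla = query
        waited = now - arrival
        slope = 0.0
        for start, _, seg_slope in sla:
            if waited >= start:
                slope = seg_slope
        return slope
    return max(queries, key=current_slope)[0]

# "gold" is penalty-free for 2 s, then its penalty climbs steeply;
# "bronze" accrues penalty slowly from the start.
gold = [(0, 0, 0.0), (2, 0, 5.0)]
bronze = [(0, 0, 1.0)]
print(pick_next([("bronze", 0, bronze), ("gold", 0, gold)], now=3))  # → gold
```

The piecewise linear shape is what iCBS exploits: within a segment the penalty slope is constant, so priorities only need recomputing at segment boundaries.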

61 citations


"On Load Shedding in Complex Event P..." refers background in this paper

  • ...They have a financial incentive to judiciously shed work from queries that are associated with a low penalty cost as specified in Service Level Agreements (SLAs), so that their profits can be maximized (similar problems have been called “profit maximization in a cloud” and have been considered in the Database-as-a-Service literature [18, 19])....


01 Jan 2003
TL;DR: Probabilistic arguments for justifying the quality of an approximate solution for global quadratic minimization problem are developed, obtained as a best point among all points of a uniform grid inside a polyhedral feasible set and some related problems are shown to be NP-hard.
Abstract: In this paper we develop probabilistic arguments for justifying the quality of an approximate solution for global quadratic minimization problem, obtained as a best point among all points of a uniform grid inside a polyhedral feasible set. Our main tool is a random walk inside the standard simplex, for which it is easy to find explicit probabilistic characteristics. For any integer k ≥ 1 we can generate an approximate solution with relative accuracy 1/k provided that the quadratic objective function is non-negative in all nodes of the feasible set. The complexity of the process is polynomial in the number of nodes and in the dimension of the space of variables. We extend some of the results to problems with polynomial objective function. We conclude the paper by showing that some related problems (maximization of cubic or quartic form over the Euclidean ball, and the matrix ellipsoid problem) are NP-hard.

60 citations

Proceedings ArticleDOI
21 Mar 2011
TL;DR: This paper proposes a novel data structure, called SLA-tree, to efficiently support profit-oriented decision making in cloud computing, and efficiently support the answering of certain profit-oriented "what if" questions.
Abstract: As cloud computing becomes increasingly important in database systems, many new challenges and opportunities have arisen. One challenge is that in cloud computing, business profit plays a central role. Hence, it is very important for a cloud service provider to quickly make profit-oriented decisions. In this paper, we propose a novel data structure, called SLA-tree, to efficiently support profit-oriented decision making. SLA-tree is built on two pieces of information: (1) a set of buffered queries waiting to be executed, which represents the scheduled events that will happen in the near future, and (2) a service level agreement (SLA) for each query, which indicates the different profits for the query for varying query response times. By constructing the SLA-tree, we efficiently support the answering of certain profit-oriented "what if" questions. Answers to these questions in turn can be applied to different profit-oriented decisions in cloud computing such as profit-aware scheduling, dispatching, and capacity planning. Extensive experimental results based on both synthetic and real-world data demonstrate the effectiveness and efficiency of our SLA-tree framework.

58 citations

Proceedings ArticleDOI
11 Jun 2007
TL;DR: A formal framework is created and it is shown that there is a unique model up to isomorphism that satisfies the standard axioms and supports associativity, so this model is ideally suited to be the standard temporal model for complex event processing.
Abstract: Event processing systems have wide applications ranging from managing events from RFID readers to monitoring RSS feeds. Consequently, there exists much work on them in the literature. The prevalent use of these systems is on-line recognition of patterns that are sequences of correlated events in event streams. Query semantics and implementation efficiency are inherently determined by the underlying temporal model: how events are sequenced (what is the "next" event), and how the time stamp of an event is represented. Many competing temporal models for event systems have been proposed, with no consensus on which approach is best. We take a foundational approach to this problem. We create a formal framework and present event system design choices as axioms. The axioms are grouped into standard axioms and desirable axioms. Standard axioms are common to the design of all event systems. Desirable axioms are not always satisfied, but are useful for achieving high performance. Given these axioms, we prove several important results. First, we show that there is a unique model up to isomorphism that satisfies the standard axioms and supports associativity, so our axioms are a sound and complete axiomatization of associative time stamps in event systems. This model requires time stamps with unbounded representations. We present a slightly weakened version of associativity that permits a temporal model with bounded representations. We show that adding the boundedness condition also results in a unique model, so again our axiomatization is sound and complete. We believe this model is ideally suited to be the standard temporal model for complex event processing.

55 citations


"On Load Shedding in Complex Event P..." refers background in this paper

  • ...In this case "shedding" all events of type B and D will sacrifice the results of Q3 but preserves A, C and E and meets the memory constraint....


  • ...Complex Event Processing (CEP) is a stream processing model that focuses on detecting event patterns in continuous event streams....


Proceedings ArticleDOI
11 Apr 2011
TL;DR: A combined stream processing system that adaptively balances workload between a dedicated local stream processor and a cloud stream processor, and can adapt effectively to workload variations, while only discarding a small percentage of input data is presented.
Abstract: Stream processing systems must handle stream data coming from real-time, high-throughput applications, for example in financial trading. Timely processing of streams is important and requires sufficient available resources to achieve high throughput and deliver accurate results. However, static allocation of stream processing resources in terms of machines is inefficient when input streams have significant rate variations—machines remain underutilised for long periods of average load. We present a combined stream processing system that, as the input stream rate varies, adaptively balances workload between a dedicated local stream processor and a cloud stream processor. This approach only utilises cloud machines when the local stream processor becomes overloaded. We evaluate a prototype system with financial trading data. Our results show that it can adapt effectively to workload variations, while only discarding a small percentage of input data.
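At its simplest, the adaptive balancing policy reduces to a threshold rule (a toy sketch with invented names and thresholds; the actual system's decision logic is richer):

```python
def route(local_queue_len, local_capacity, cloud_available):
    """Where to send the next input tuple."""
    if local_queue_len < local_capacity:
        return "local"   # headroom: keep processing on-premises
    if cloud_available:
        return "cloud"   # overloaded: spill to rented machines
    return "shed"        # last resort: discard the tuple

print(route(5, 10, False), route(12, 10, True), route(12, 10, False))
```

Shedding only fires when both the local processor and the cloud path are exhausted, which matches the abstract's claim of discarding just a small fraction of the input.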

43 citations


"On Load Shedding in Complex Event P..." refers background in this paper

  • ...Specifically, stream processing applications are gradually shifting to the cloud [33]....
