Proceedings ArticleDOI

High-performance complex event processing over streams

TL;DR: This paper proposes a complex event language that significantly extends existing event languages to meet the needs of a range of RFID-enabled monitoring applications and describes a query plan-based approach to efficiently implementing this language.
Abstract: In this paper, we present the design, implementation, and evaluation of a system that executes complex event queries over real-time streams of RFID readings encoded as events. These complex event queries filter and correlate events to match specific patterns, and transform the relevant events into new composite events for the use of external monitoring applications. Stream-based execution of these queries enables time-critical actions to be taken in environments such as supply chain management, surveillance and facility management, healthcare, etc. We first propose a complex event language that significantly extends existing event languages to meet the needs of a range of RFID-enabled monitoring applications. We then describe a query plan-based approach to efficiently implementing this language. Our approach uses native operators to efficiently handle query-defined sequences, which are a key component of complex event processing, and pipeline such sequences to subsequent operators that are built by leveraging relational techniques. We also develop a large suite of optimization techniques to address challenges such as large sliding windows and intermediate result sizes. We demonstrate the effectiveness of our approach through a detailed performance analysis of our prototype implementation under a range of data and query workloads as well as through a comparison to a state-of-the-art stream processor.
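
To make the query model concrete, consider the shoplifting scenario discussed for this paper: an item is read at a shelf, never read at a checkout counter, and then read at a building exit. The sketch below is a minimal illustration of such a sequence-with-negation query over a time window; the event fields and function names are hypothetical, and the paper's actual engine compiles such queries into native NFA-based sequence operators rather than hand-written loops.

    # Minimal sketch of a SEQ(SHELF, !COUNTER, EXIT) pattern with a time
    # window. Hypothetical names; illustration only, not the paper's code.
    from dataclasses import dataclass

    @dataclass
    class Event:
        etype: str    # "SHELF", "COUNTER", or "EXIT"
        tag_id: str   # RFID tag identifier
        ts: float     # reading timestamp

    def detect_shoplifting(stream, window):
        """Yield (shelf, exit) pairs: a SHELF reading followed, with no
        intervening COUNTER reading for the same tag, by an EXIT reading
        within `window` time units."""
        partial = {}  # tag_id -> pending SHELF event (a partial match)
        for e in stream:
            if e.etype == "SHELF":
                partial[e.tag_id] = e
            elif e.etype == "COUNTER":
                partial.pop(e.tag_id, None)      # negation cancels the match
            elif e.etype == "EXIT":
                shelf = partial.pop(e.tag_id, None)
                if shelf is not None and e.ts - shelf.ts <= window:
                    yield (shelf, e)             # emit a composite event

    readings = [Event("SHELF", "t1", 0.0), Event("COUNTER", "t2", 1.0),
                Event("EXIT", "t1", 5.0)]
    for shelf, exit_ in detect_shoplifting(readings, window=12.0):
        print("alert:", shelf.tag_id)            # alert: t1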


Citations
Journal ArticleDOI
TL;DR: This paper presents a systematic framework to decompose big data systems into four sequential modules, namely data generation, data acquisition, data storage, and data analytics, and presents the prevalent Hadoop framework for addressing big data challenges.
Abstract: Recent technological advancements have led to a deluge of data from distinctive domains (e.g., health care and scientific sensors, user-generated data, Internet and financial companies, and supply chain systems) over the past two decades. The term big data was coined to capture the meaning of this emerging trend. In addition to its sheer volume, big data also exhibits other unique characteristics as compared with traditional data. For instance, big data is commonly unstructured and requires more real-time analysis. This development calls for new system architectures for data acquisition, transmission, storage, and large-scale data processing mechanisms. In this paper, we present a literature survey and system tutorial for big data analytics platforms, aiming to provide an overall picture for nonexpert readers and instill a do-it-yourself spirit for advanced audiences to customize their own big-data solutions. First, we present the definition of big data and discuss big data challenges. Next, we present a systematic framework to decompose big data systems into four sequential modules, namely data generation, data acquisition, data storage, and data analytics. These four modules form a big data value chain. Following that, we present a detailed survey of numerous approaches and mechanisms from research and industry communities. In addition, we present the prevalent Hadoop framework for addressing big data challenges. Finally, we outline several evaluation benchmarks and potential research directions for big data systems.

1,002 citations


Cites background from "High-performance complex event proc..."

  • ...A shoplifting example that uses high-level complex events is discussed in [254]....


Journal ArticleDOI
TL;DR: A general, unifying model is proposed to capture the different aspects of an IFP system and used to provide a complete and precise classification of the systems and mechanisms proposed so far.
Abstract: A large number of distributed applications require continuous and timely processing of information as it flows from the periphery to the center of the system. Examples include intrusion detection systems which analyze network traffic in real-time to identify possible attacks; environmental monitoring applications which process raw data coming from sensor networks to identify critical situations; or applications performing online analysis of stock prices to identify trends and forecast future values. Traditional DBMSs, which need to store and index data before processing it, can hardly fulfill the requirements of timeliness coming from such domains. Accordingly, during the last decade, different research communities developed a number of tools, which we collectively call information flow processing (IFP) systems, to support these scenarios. They differ in their system architecture, data model, rule model, and rule language. In this article, we survey these systems to help researchers, who often come from different backgrounds, in understanding how the various approaches they adopt may complement each other. In particular, we propose a general, unifying model to capture the different aspects of an IFP system and use it to provide a complete and precise classification of the systems and mechanisms proposed so far.

918 citations


Cites background from "High-performance complex event proc..."

  • ...Sase [Wu et al. 2006] is a monitoring system designed to perform complex queries over real-time flows of RFID readings....


Proceedings ArticleDOI
09 Jun 2008
TL;DR: This paper presents a formal evaluation model that offers precise semantics for this new class of queries and a query evaluation framework permitting optimizations in a principled way; it further analyzes the runtime complexity of query evaluation using this model and develops a suite of techniques that improve runtime efficiency by exploiting sharing in storage and processing.
Abstract: Pattern matching over event streams is increasingly being employed in many areas including financial services, RFID-based inventory management, click stream analysis, and electronic health systems. While regular expression matching is well studied, pattern matching over streams presents two new challenges: languages for pattern matching over streams are significantly richer than languages for regular expression matching, and efficient evaluation of these pattern queries over streams requires new algorithms and optimizations, since the conventional wisdom for stream query processing (i.e., using selection-join-aggregation) is inadequate. In this paper, we present a formal evaluation model that offers precise semantics for this new class of queries and a query evaluation framework permitting optimizations in a principled way. We further analyze the runtime complexity of query evaluation using this model and develop a suite of techniques that improve runtime efficiency by exploiting sharing in storage and processing. Our experimental results provide insights into the various factors affecting runtime performance and demonstrate the significant performance gains of our sharing techniques.
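
As a rough illustration of the "sharing in storage" idea mentioned in this abstract, partial matches can record integer offsets into one shared event buffer instead of copying events into every match. This is a hedged sketch under assumed structures (SharedBuffer and Run are hypothetical names), not the paper's actual data structures.

    # Sketch of storage sharing: events live once in a shared buffer and
    # partial matches ("runs") hold only offsets into it.
    class SharedBuffer:
        def __init__(self):
            self.events = []                 # single physical copy of events
        def append(self, event):
            self.events.append(event)
            return len(self.events) - 1      # offset for runs to record

    class Run:
        def __init__(self, offsets=()):
            self.offsets = list(offsets)     # indices into the shared buffer
        def extend(self, offset):
            return Run(self.offsets + [offset])   # copies small ints, not events
        def materialize(self, buf):
            return [buf.events[i] for i in self.offsets]

    buf = SharedBuffer()
    i = buf.append({"type": "A", "price": 10})
    j = buf.append({"type": "B", "price": 12})
    match = Run().extend(i).extend(j)
    print(match.materialize(buf))            # both events, stored only once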

441 citations

Proceedings Article
01 Jan 2007
TL;DR: This work describes the design and implementation of the Cornell Cayuga System for scalable event processing and presents a query language based on Cayuga Algebra for naturally expressing complex event patterns.
Abstract: We describe the design and implementation of the Cornell Cayuga System for scalable event processing. We present a query language based on Cayuga Algebra for naturally expressing complex event patterns. We also describe several novel system design and implementation issues, focusing on Cayuga’s query processor, its indexing approach, how Cayuga handles simultaneous events, and its specialized garbage collector.

393 citations


Cites background from "High-performance complex event proc..."

  • ...Complex event systems such as SNOOP [1], ODE [11] and SASE [18] are closest in spirit to our own work....


Journal ArticleDOI
TL;DR: This article presents a survey of optimizations for stream processing, organized in a style similar to catalogs of design patterns or refactorings, to help future streaming system builders stand on the shoulders of giants from not just their own community.
Abstract: Various research communities have independently arrived at stream processing as a programming model for efficient and parallel computing. These communities include digital signal processing, databases, operating systems, and complex event processing. Since each community faces applications with challenging performance requirements, each of them has developed some of the same optimizations, but often with conflicting terminology and unstated assumptions. This article presents a survey of optimizations for stream processing. It is aimed both at users who need to understand and guide the system’s optimizer and at implementers who need to make engineering tradeoffs. To consolidate terminology, this article is organized as a catalog, in a style similar to catalogs of design patterns or refactorings. To make assumptions explicit and help understand tradeoffs, each optimization is presented with its safety constraints (when does it preserve correctness?) and a profitability experiment (when does it improve performance?). We hope that this survey will help future streaming system builders to stand on the shoulders of giants from not just their own community.
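
One optimization from this catalog, operator fusion, is also what the excerpts below attribute to SASE. The hedged sketch contrasts an unfused pipeline, which materializes an intermediate list between the source and a filter, with a fused one that evaluates the filter inside the source loop; all function names are hypothetical.

    # Sketch of operator fusion: the fused version piggy-backs the filter
    # on the source loop, so no intermediate collection is built.
    def parse(reading):
        return reading                        # stand-in for decoding a raw reading

    def unfused(readings, pred):
        parsed = [parse(r) for r in readings] # intermediate result materialized
        return [e for e in parsed if pred(e)]

    def fused(readings, pred):
        for r in readings:                    # filter runs inside the source loop
            e = parse(r)
            if pred(e):
                yield e                       # streamed, no intermediate list

    even = lambda e: e % 2 == 0
    print(unfused(range(10), even))           # [0, 2, 4, 6, 8]
    print(list(fused(range(10), even)))       # same output, less intermediate state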

314 citations


Cites methods or result from "High-performance complex event proc..."

  • ...An example is SASE [Wu et al. 2006]....


  • ...A related approach is SASE, which can fuse certain operators with the source operator and then implement these operators by a different algorithm [Wu et al. 2006]....


  • ...For instance, when SASE fuses a source operator that reads input data with a downstream operator, it combines them such that the downstream operator is piggy-backed incrementally on the source operator, producing fewer intermediate results [Wu et al. 2006]....


  • ...For example, Galax uses nested-relational algebra for XML processing [Ré et al. 2006], and SASE uses a custom algebra for finding temporal patterns across sequences of data items [Wu et al. 2006]....


References
Proceedings Article
01 Jan 2003
TL;DR: The next generation Telegraph system, called TelegraphCQ, is focused on meeting the challenges that arise in handling large streams of continuous queries over high-volume, highly-variable data streams and leverages the PostgreSQL open source code base.
Abstract: Increasingly pervasive networks are leading towards a world where data is constantly in motion. In such a world, conventional techniques for query processing, which were developed under the assumption of a far more static and predictable computational environment, will not be sufficient. Instead, query processors based on adaptive dataflow will be necessary. The Telegraph project has developed a suite of novel technologies for continuously adaptive query processing. The next generation Telegraph system, called TelegraphCQ, is focused on meeting the challenges that arise in handling large streams of continuous queries over high-volume, highly-variable data streams. In this paper, we describe the system architecture and its underlying technology, and report on our ongoing implementation effort, which leverages the PostgreSQL open source code base. We also discuss open issues and our research agenda.

1,248 citations


"High-performance complex event proc..." refers background or methods in this paper

  • ...Languages for stream processing [2][7][19] lack constructs to address non-occurrences of events and become unwieldy for specifying complex order-oriented constraints....


  • ...We compare two algorithms: the Basic algorithm (presented in Section 3.2) that constructs event sequences from the runtime stack used by the NFA, and the AIS algorithm (presented in Section 4.1) that builds active instance stacks for sequence construction....

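
The excerpt above contrasts the Basic algorithm with active instance stacks (AIS). The following simplified sketch of the stack idea for SEQ(A, B, C) omits predicates and windows, and all names are hypothetical: each arriving instance records the current top of the previous type's stack, and sequences are enumerated by following those pointers instead of replaying the NFA's runtime stack.

    # Sketch of active instance stacks: one stack per event type; matches
    # are enumerated by a depth-first walk over the recorded pointers.
    def build_stacks(stream, types=("A", "B", "C")):
        stacks = {t: [] for t in types}
        prev = {types[i]: types[i - 1] for i in range(1, len(types))}
        for etype, payload in stream:
            if etype == types[0]:
                stacks[etype].append((payload, None))
            elif etype in prev and stacks[prev[etype]]:
                ptr = len(stacks[prev[etype]]) - 1   # most recent predecessor
                stacks[etype].append((payload, ptr))
        return stacks

    def matches(stacks, types=("A", "B", "C")):
        def expand(level, max_idx):
            for i in range(max_idx + 1):
                payload, ptr = stacks[types[level]][i]
                if level == 0:
                    yield (payload,)
                else:
                    for prefix in expand(level - 1, ptr):
                        yield prefix + (payload,)
        yield from expand(len(types) - 1, len(stacks[types[-1]]) - 1)

    stream = [("A", "a1"), ("B", "b1"), ("A", "a2"), ("C", "c1")]
    print(list(matches(build_stacks(stream))))   # [('a1', 'b1', 'c1')]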

Proceedings ArticleDOI
03 Jun 2002
TL;DR: This paper proposes a novel holistic twig join algorithm, TwigStack, that uses a chain of linked stacks to compactly represent partial results to root-to-leaf query paths, which are then composed to obtain matches for the twig pattern.
Abstract: XML employs a tree-structured data model, and, naturally, XML queries specify patterns of selection predicates on multiple elements related by a tree structure. Finding all occurrences of such a twig pattern in an XML database is a core operation for XML query processing. Prior work has typically decomposed the twig pattern into binary structural (parent-child and ancestor-descendant) relationships, and twig matching is achieved by: (i) using structural join algorithms to match the binary relationships against the XML database, and (ii) stitching together these basic matches. A limitation of this approach for matching twig patterns is that intermediate result sizes can get large, even when the input and output sizes are more manageable. In this paper, we propose a novel holistic twig join algorithm, TwigStack, for matching an XML query twig pattern. Our technique uses a chain of linked stacks to compactly represent partial results to root-to-leaf query paths, which are then composed to obtain matches for the twig pattern. When the twig pattern uses only ancestor-descendant relationships between elements, TwigStack is I/O and CPU optimal among all sequential algorithms that read the entire input: it is linear in the sum of sizes of the input lists and the final result list, but independent of the sizes of intermediate results. We then show how to use (a modification of) B-trees, along with TwigStack, to match query twig patterns in sub-linear time. Finally, we complement our analysis with experimental results on a range of real and synthetic data, and query twig patterns.
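
The note below observes that SASE's active instance stacks resemble PathStacks in stack arrangement. As a rough illustration of the linked-stack idea for the simpler path case (a//b//c with only ancestor-descendant edges), this sketch models elements as (tag, start, end) intervals from a well-nested document; it is a simplification for intuition, not the paper's full TwigStack algorithm.

    # Sketch of PathStack-style matching for the path pattern a//b//c.
    # Each pattern node gets a stack; an entry points at the top of the
    # parent stack at push time, and matches are emitted on leaf pushes.
    def path_stack(elements, path):
        stacks = [[] for _ in path]
        out = []
        for tag, start, end in sorted(elements, key=lambda e: e[1]):
            for s in stacks:                     # pop elements that ended:
                while s and s[-1][0][2] < start: # they contain nothing to come
                    s.pop()
            for q, qtag in enumerate(path):
                if tag != qtag:
                    continue
                if q == 0:
                    stacks[0].append(((tag, start, end), None))
                elif stacks[q - 1]:              # viable only under an ancestor
                    stacks[q].append(((tag, start, end),
                                      len(stacks[q - 1]) - 1))
                    if q == len(path) - 1:       # leaf push: emit matches now
                        out.extend(expand(stacks, q, len(stacks[q]) - 1))
        return out

    def expand(stacks, level, idx):
        elem, ptr = stacks[level][idx]
        if level == 0:
            return [(elem,)]
        return [prefix + (elem,)
                for i in range(ptr + 1)
                for prefix in expand(stacks, level - 1, i)]

    doc = [("a", 1, 10), ("b", 2, 9), ("a", 3, 8), ("c", 4, 5)]
    print(path_stack(doc, ["a", "b", "c"]))  # [(('a',1,10), ('b',2,9), ('c',4,5))]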

1,014 citations


"High-performance complex event proc..." refers methods in this paper

  • ...Active Instance Stacks (AIS) in SASE appear similar to PathStacks for XML pattern matching [3] in stack arrangements....


Proceedings ArticleDOI
01 May 1999
TL;DR: It is proved that for predicates reducible to conjunctions of elementary tests, the expected time to match a random event is no greater than O(N^(1-λ)), where N is the number of subscriptions and λ is a closed-form expression that depends on the number and type of attributes (in some cases, λ = 1/2).
Abstract: Content-based subscription systems are an emerging alternative to traditional publish-subscribe systems, because they permit more flexible subscriptions along multiple dimensions. In these systems, each subscription is a predicate which may test arbitrary attributes within an event. However, the matching problem for content-based systems, determining for each event the subset of all subscriptions whose predicates match the event, is still an open problem. We present an efficient, scalable solution to the matching problem. Our solution has an expected time complexity that is sub-linear in the number of subscriptions, and it has a space complexity that is linear. Specifically, we prove that for predicates reducible to conjunctions of elementary tests, the expected time to match a random event is no greater than O(N^(1-λ)), where N is the number of subscriptions and λ is a closed-form expression that depends on the number and type of attributes (in some cases, λ = 1/2). We present some optimizations to our algorithms that improve the search time. We also present the results of simulations that validate the theoretical bounds and that show acceptable performance levels for tens of thousands of subscriptions.
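
To fix intuition for the matching problem itself, the sketch below uses a naive counting approach over equality predicates: index each elementary test, count how many of a subscription's tests an event satisfies, and report subscriptions whose whole conjunction is satisfied. This is an illustration only, with hypothetical names; the paper's actual algorithm organizes tests into a matching structure to achieve the sub-linear expected time quoted above.

    # Sketch of a counting-based matcher for conjunctions of equality tests.
    from collections import defaultdict

    class Matcher:
        def __init__(self):
            self.index = defaultdict(set)   # (attribute, value) -> subscription ids
            self.size = {}                  # subscription id -> number of predicates

        def subscribe(self, sub_id, predicates):
            # predicates: dict attribute -> required value (one conjunction)
            self.size[sub_id] = len(predicates)
            for attr, value in predicates.items():
                self.index[(attr, value)].add(sub_id)

        def match(self, event):
            # A subscription matches when all of its elementary tests hold.
            hits = defaultdict(int)
            for attr, value in event.items():
                for sub_id in self.index.get((attr, value), ()):
                    hits[sub_id] += 1
            return [s for s, n in hits.items() if n == self.size[s]]

    m = Matcher()
    m.subscribe("s1", {"type": "airplane", "price": 100})
    m.subscribe("s2", {"type": "car"})
    print(m.match({"type": "airplane", "price": 100, "seller": "x"}))  # ['s1']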

736 citations

Proceedings Article
01 Jan 2003
TL;DR: The architectural challenges facing the design of large-scale distributed stream processing systems are described, and novel approaches for addressing load management, high availability, and federated operation issues are discussed.
Abstract: Stream processing fits a large class of new applications for which conventional DBMSs fall short. Because many stream-oriented systems are inherently geographically distributed and because distribution offers scalable load management and higher availability, future stream processing systems will operate in a distributed fashion. They will run across the Internet on computers typically owned by multiple cooperating administrative domains. This paper describes the architectural challenges facing the design of large-scale distributed stream processing systems, and discusses novel approaches for addressing load management, high availability, and federated operation issues. We describe two stream processing systems, Aurora* and Medusa, which are being designed to explore complementary solutions to these challenges. We begin in Section 2 with a brief description of our centralized stream processing system, Aurora [4]. We then discuss two complementary efforts to extend Aurora to a distributed environment: Aurora* and Medusa. Aurora* assumes an environment in which all nodes fall under a single administrative domain. Medusa provides the infrastructure to support federated operation of nodes across administrative boundaries. After describing the architectures of these two systems in Section 3, we consider three design challenges common to both: infrastructures and protocols supporting communication amongst nodes (Section 4), load sharing in response to variable network conditions (Section 5), and high availability in the presence of failures (Section 6). We also discuss high-level policy specifications employed by the two systems in Section 7. For all of these issues, we believe that the push-based nature of stream-based applications not only raises new challenges but also offers the possibility of new domain-specific solutions.

624 citations


"High-performance complex event proc..." refers background or methods in this paper

  • ...been extensively studied in the field of stream processing [2][7][9][13], we expect to adopt many stream processing techniques in our system....


  • ...Stream processing systems in the relational setting [7][9][19][24] are not optimized for complex event processing, whereas event processing systems very recently developed [18][26][29] have not focused on fast implementations....


Proceedings ArticleDOI
01 May 2001
TL;DR: In this article, the authors describe an attempt at the construction of such algorithms and its implementation using a combination of data structures, application-specific caching policies, and application-specific query processing, which can handle 600 events per second for a typical workload containing 6 million subscriptions.
Abstract: Publish/Subscribe is the paradigm in which users express long-term interests (“subscriptions”) and some agent “publishes” events (e.g., offers). The job of Publish/Subscribe software is to send events to the owners of subscriptions satisfied by those events. For example, a user subscription may consist of an interest in an airplane of a certain type, not to exceed a certain price. A published event may consist of an offer of an airplane with certain properties including price. Each subscription consists of a conjunction of (attribute, comparison operator, value) predicates. A subscription closely resembles a trigger in that it is a long-lived conditional query associated with an action (usually, informing the subscriber). However, it is less general than a trigger, so novel data structures and implementations may enable the creation of more scalable, high-performance publish/subscribe systems. This paper describes an attempt at the construction of such algorithms and its implementation. Using a combination of data structures, application-specific caching policies, and application-specific query processing, our system can handle 600 events per second for a typical workload containing 6 million subscriptions.

587 citations