TelegraphCQ: continuous dataflow processing

doi:10.1145/872757.872857

Proceedings Article

Sirish Chandrasekaran¹, Owen Cooper¹, Amol Deshpande¹, Michael J. Franklin¹, Joseph M. Hellerstein¹, Wei Hong², Sailesh Krishnamurthy¹, Samuel Madden¹, Fred Reiss¹, Mehul A. Shah¹

University of California, Berkeley¹, Intel²

09 Jun 2003, p. 668

TL;DR: This demonstration shows the current version of TelegraphCQ, which was implemented by leveraging the code base of the open source PostgreSQL database system; the authors found that a significant portion of the PostgreSQL code was easily reusable.

Abstract: At Berkeley, we are developing TelegraphCQ [1, 2], a dataflow system for processing continuous queries over data streams. TelegraphCQ is based on a novel, highly-adaptive architecture supporting dynamic query workloads in volatile data streaming environments. In this demonstration we show our current version of TelegraphCQ, which we implemented by leveraging the code base of the open source PostgreSQL database system. Although TelegraphCQ differs significantly from a traditional database system, we found that a significant portion of the PostgreSQL code was easily reusable. We also found the extensibility features of PostgreSQL very useful, particularly its rich data types and the ability to load user-developed functions. Challenges: As discussed in [1], sharing and adaptivity are our main techniques for implementing a continuous query system. Doing this in the codebase of a conventional database posed a number of challenges:

Citations

Proceedings Article

The Design of the Borealis Stream Processing Engine

Daniel J. Abadi¹, Yanif Ahmad², Magdalena Balazinska¹, Mitch Cherniack³, Jeong-Hyon Hwang², Wolfgang Lindner¹, Anurag S. Maskey³, Alexander Rasin², Esther Ryvkina³, Nesime Tatbul², Ying Xing², Stan Zdonik²

Massachusetts Institute of Technology¹, Brown University², Brandeis University³

01 Jan 2005

TL;DR: This paper outlines the basic design and functionality of Borealis, and presents a highly flexible and scalable QoS-based optimization model that operates across server and sensor networks and a new fault-tolerance model with flexible consistency-availability trade-offs.

Abstract: Borealis is a second-generation distributed stream processing engine that is being developed at Brandeis University, Brown University, and MIT. Borealis inherits core stream processing functionality from Aurora [14] and distribution functionality from Medusa [51]. Borealis modifies and extends both systems in non-trivial and critical ways to provide advanced capabilities that are commonly required by newly-emerging stream processing applications. In this paper, we outline the basic design and functionality of Borealis. Through sample real-world applications, we motivate the need for dynamically revising query results and modifying query specifications. We then describe how Borealis addresses these challenges through an innovative set of features, including revision records, time travel, and control lines. Finally, we present a highly flexible and scalable QoS-based optimization model that operates across server and sensor networks and a new fault-tolerance model with flexible consistency-availability trade-offs.

1,533 citations

Cites background from "TelegraphCQ: continuous dataflow pr..."

...Several groups have developed working prototypes [1, 4, 16] and many papers have been published on detailed aspects of the technology such as data models [2, 5, 46], scheduling [8, 15], and load shedding [9, 20, 44]....

Journal Article

The CQL continuous query language: semantic foundations and query execution

Arvind Arasu¹, Shivnath Babu¹, Jennifer Widom¹

Stanford University¹

01 Jun 2006

TL;DR: This paper presents the structure of CQL's query execution plans as well as details of the most important components: operators, interoperator queues, synopses, and sharing of components among multiple operators and queries.

Abstract: CQL, a continuous query language, is supported by the STREAM prototype data stream management system (DSMS) at Stanford. CQL is an expressive SQL-based declarative language for registering continuous queries against streams and stored relations. We begin by presenting an abstract semantics that relies only on “black-box” mappings among streams and relations. From these mappings we define a precise and general interpretation for continuous queries. CQL is an instantiation of our abstract semantics using SQL to map from relations to relations, window specifications derived from SQL-99 to map from streams to relations, and three new operators to map from relations to streams. Most of the CQL language is operational in the STREAM system. We present the structure of CQL's query execution plans as well as details of the most important components: operators, interoperator queues, synopses, and sharing of components among multiple operators and queries. Examples throughout the paper are drawn from the Linear Road benchmark recently proposed for DSMSs. We also curate a public repository of data stream applications that includes a wide variety of queries expressed in CQL. The relative ease of capturing these applications in CQL is one indicator that the language contains an appropriate set of constructs for data stream processing.
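The three kinds of mapping named in this abstract can be illustrated with a small sketch. This is not the STREAM implementation: the function names, the `Element` type, and the time-based range window are assumptions chosen for illustration, and only an insert-stream style operator (the tuples that newly appear in the result) is shown for the relation-to-stream step.

```python
from collections import Counter, namedtuple

# Hypothetical stream element: a value tagged with a timestamp.
Element = namedtuple("Element", ["ts", "value"])


def range_window(stream, now, width):
    """Stream-to-relation: values whose timestamp lies in (now - width, now]."""
    return [e.value for e in stream if now - width < e.ts <= now]


def select(relation, predicate):
    """Relation-to-relation: an ordinary filter, as SQL would apply."""
    return [v for v in relation if predicate(v)]


def istream(prev, curr):
    """Relation-to-stream: values in curr that were not in prev (bag difference)."""
    diff = Counter(curr) - Counter(prev)
    return sorted(diff.elements())


def run_query(stream, times, width, predicate):
    """Re-evaluate the query at each time instant and emit only the new results."""
    out = []
    prev = []
    for t in times:
        curr = select(range_window(stream, t, width), predicate)
        for v in istream(prev, curr):
            out.append(Element(t, v))
        prev = curr
    return out
```

Under these assumptions, a query is a composition of the three steps: the window turns the unbounded stream into a finite relation at each instant, the filter is plain relational logic, and the final operator turns the changing relation back into a stream.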

1,235 citations

Journal Article

The CQL Continuous Query Language : Semantic Foundations and Query Execution

Arvind Arasu

01 Jan 2003, CTIT technical reports series

1,115 citations

Journal Article

Issues in data stream management

Lukasz Golab¹, M. Tamer Özsu¹

University of Waterloo¹

01 Jun 2003

TL;DR: The purpose of this paper is to review recent work in data stream management systems, with an emphasis on application requirements, data models, continuous query languages, and query evaluation.

Abstract: Traditional databases store sets of relatively static records with no pre-defined notion of time, unless timestamp attributes are explicitly added. While this model adequately represents commercial catalogues or repositories of personal information, many current and emerging applications require support for on-line analysis of rapidly changing data streams. Limitations of traditional DBMSs in supporting streaming applications have been recognized, prompting research to augment existing technologies and build new systems to manage streaming data. The purpose of this paper is to review recent work in data stream management systems, with an emphasis on application requirements, data models, continuous query languages, and query evaluation.

1,068 citations

Cites background or methods from "TelegraphCQ: continuous dataflow pr..."

...Designing disk-based data structures and indices to exploit access patterns of stream archives is an open problem [12]....

...Three proposed relation-based languages are CQL [3, 56], StreaQuel [12, 14], and AQuery [47], each of which has SQL-like syntax and enhanced support for windows and ordering....

...• TelegraphCQ [12] is a continuous query processing system that focuses on shared query evaluation and adaptive query processing http://telegraph....

Book Chapter

C-store: a column-oriented DBMS

Michael Stonebraker¹, Daniel J. Abadi¹, Adam Batkin², Xuedong Chen³, Mitch Cherniack², Miguel Ferreira¹, Edmond Lau¹, Amerson Lin¹, Samuel Madden¹, Elizabeth O'Neil³, Patrick O'Neil³, Alexander Rasin⁴, Nga Tran², Stan Zdonik⁴

Massachusetts Institute of Technology¹, Brandeis University², University of Massachusetts Boston³, Brown University⁴

01 Dec 2018

TL;DR: Preliminary performance data on a subset of TPC-H is presented and it is shown that the system the team is building, C-Store, is substantially faster than popular commercial products.

Abstract: This paper presents the design of a read-optimized relational DBMS that contrasts sharply with most current systems, which are write-optimized. Among the many differences in its design are: storage of data by column rather than by row, careful coding and packing of objects into storage including main memory during query processing, storing an overlapping collection of column-oriented projections, rather than the current fare of tables and indexes, a non-traditional implementation of transactions which includes high availability and snapshot isolation for read-only transactions, and the extensive use of bitmap indexes to complement B-tree structures. We present preliminary performance data on a subset of TPC-H and show that the system we are building, C-Store, is substantially faster than popular commercial products. Hence, the architecture looks very encouraging.

1,063 citations

References

Proceedings Article

TelegraphCQ: Continuous Dataflow Processing for an Uncertain World.

Sirish Chandrasekaran, Owen Cooper, Amol Deshpande, Michael J. Franklin, Joseph M. Hellerstein, Wei Hong, Sailesh Krishnamurthy, Samuel Madden, Vijayshankar Raman, Frederick Reiss, Mehul A. Shah

01 Jan 2003

TL;DR: The next generation Telegraph system, called TelegraphCQ, is focused on meeting the challenges that arise in handling large streams of continuous queries over high-volume, highly-variable data streams and leverages the PostgreSQL open source code base.

Abstract: Increasingly pervasive networks are leading towards a world where data is constantly in motion. In such a world, conventional techniques for query processing, which were developed under the assumption of a far more static and predictable computational environment, will not be sufficient. Instead, query processors based on adaptive dataflow will be necessary. The Telegraph project has developed a suite of novel technologies for continuously adaptive query processing. The next generation Telegraph system, called TelegraphCQ, is focused on meeting the challenges that arise in handling large streams of continuous queries over high-volume, highly-variable data streams. In this paper, we describe the system architecture and its underlying technology, and report on our ongoing implementation effort, which leverages the PostgreSQL open source code base. We also discuss open issues and our research agenda.

1,248 citations

Journal Article

TelegraphCQ: An Architectural Status Report

Sailesh Krishnamurthy¹, Sirish Chandrasekaran, Owen Cooper, Amol Deshpande, Michael J. Franklin, Joseph M. Hellerstein, Wei Hong, Samuel Madden, Frederick Reiss, Mehul A. Shah

University of California, Berkeley¹

01 Jan 2003, IEEE Data(base) Engineering Bulletin

TL;DR: The experiences of extending a traditional DBMS towards managing data streams, and an overview of the current early-access release of the TelegraphCQ system are described.

Abstract: We are building TelegraphCQ, a system to process continuous queries over data streams. Although we had implemented some parts of this technology in earlier Java-based prototypes, our experiences were not positive. As a result, we decided to use PostgreSQL, an open source RDBMS as a starting point for our new implementation. In March 2003, we completed an alpha milestone of TelegraphCQ. In this paper, we report on the development status of our project, with a focus on architectural issues. Specifically, we describe our experiences extending a traditional DBMS towards managing data streams, and an overview of the current early-access release of the system.

105 citations