Home
/
Authors
/
Sailesh Krishnamurthy

Author

Sailesh Krishnamurthy

Other affiliations: Cisco Systems, Inc., Raman Research Institute, Amazon.com ...read more

Bio: Sailesh Krishnamurthy is an academic researcher from Google. The author has contributed to research in topics: Scalability & Cache. The author has an hindex of 20, co-authored 25 publications receiving 3573 citations. Previous affiliations of Sailesh Krishnamurthy include Cisco Systems, Inc. & Raman Research Institute.

Topics: Scalability, Cache, Smart Cache, Wireless sensor network, Database caching ...read more

Papers

PDF

Open Access

More filters

Proceedings Article•

TelegraphCQ: Continuous Dataflow Processing for an Uncertain World.

[...]

Sirish Chandrasekaran, Owen Cooper, Amol Deshpande, Michael J. Franklin, Joseph M. Hellerstein, Wei Hong, Sailesh Krishnamurthy, Samuel Madden, Vijayshankar Raman, Frederick Reiss, Mehul A. Shah - Show less +7 more

01 Jan 2003

TL;DR: The next generation Telegraph system, called TelegraphCQ, is focused on meeting the challenges that arise in handling large streams of continuous queries over high-volume, highly-variable data streams and leverages the PostgreSQL open source code base.

...read moreread less

Abstract: Increasingly pervasive networks are leading towards a world where data is constantly in motion. In such a world, conventional techniques for query processing, which were developed under the assumption of a far more static and predictable computational environment, will not be sufficient. Instead, query processors based on adaptive dataflow will be necessary. The Telegraph project has developed a suite of novel technologies for continuously adaptive query processing. The next generation Telegraph system, called TelegraphCQ, is focused on meeting the challenges that arise in handling large streams of continuous queries over high-volume, highly-variable data streams. In this paper, we describe the system architecture and its underlying technology, and report on our ongoing implementation effort, which leverages the PostgreSQL open source code base. We also discuss open issues and our research agenda.

...read moreread less

1,248 citations

Proceedings Article•DOI•

TelegraphCQ: continuous dataflow processing

[...]

Sirish Chandrasekaran¹, Owen Cooper¹, Amol Deshpande¹, Michael J. Franklin¹, Joseph M. Hellerstein¹, Wei Hong², Sailesh Krishnamurthy¹, Samuel Madden¹, Fred Reiss¹, Mehul A. Shah¹ - Show less +6 more•Institutions (2)

University of California, Berkeley¹, Intel²

09 Jun 2003

TL;DR: The current version of TelegraphCQ is shown, which is implemented by leveraging the code base of the open source PostgreSQL database system, which found that a significant portion of the PostgreSQL code was easily reusable.

...read moreread less

Abstract: At Berkeley, we are developing TelegraphCQ [1, 2], a dataflow system for processing continuous queries over data streams. TelegraphCQ is based on a novel, highly-adaptive architecture supporting dynamic query workloads in volatile data streaming environments. In this demonstration we show our current version of TelegraphCQ, which we implemented by leveraging the code base of the open source PostgreSQL database system. Although TelegraphCQ differs significantly from a traditional database system, we found that a significant portion of the PostgreSQL code was easily reusable. We also found the extensibility features of PostgreSQL very useful, particularly its rich data types and the ability to load user-developed functions. Challenges: As discussed in [1], sharing and adaptivity are our main techniques for implementing a continuous query system. Doing this in the codebase of a conventional database posed a number of challenges:

...read moreread less

767 citations

Proceedings Article•

Design Considerations for High Fan-In Systems: The HiFi Approach.

[...]

Michael J. Franklin, Shawn R. Jeffery, Sailesh Krishnamurthy, Frederick Reiss, Shariq Rizvi, Eugene Wu, Owen Cooper, Anil Edakkunni, Wei Hong - Show less +5 more

01 Jan 2005

TL;DR: This paper identifies the key characteristics and data management challenges presented by high fan-in systems, and argues for a uniform, query-based approach towards addressing them, and presents the initial design concepts behind HiFi.

...read moreread less

Abstract: Advances in data acquisition and sensor technologies are leading towards the development of “high fan-in” architectures: widely distributed systems whose edges consist of numerous receptors such as sensor networks, RFID readers, or probes, and whose interior nodes are traditional host computers organized using the principles of cascading streams and successive aggregation. Examples include RFID-enabled supply chain management, largescale environmental monitoring, and various types of network and computing infrastructure monitoring. In this paper, we identify the key characteristics and data management challenges presented by high fan-in systems, and argue for a uniform, query-based approach towards addressing them. We then present our initial design concepts behind HiFi, the system we are building to embody these ideas, and describe a proof-of-concept prototype.

...read moreread less

221 citations

Proceedings Article•DOI•

Middle-tier database caching for e-business

[...]

Qiong Luo¹, Sailesh Krishnamurthy², Chandrasekaran Mohan³, Hamid Pirahesh³, Honguk Woo⁴, Bruce G. Lindsay³, Jeffrey F. Naughton¹ - Show less +3 more•Institutions (4)

University of Wisconsin-Madison¹, University of California, Berkeley², IBM³, University of Texas at Austin⁴

03 Jun 2002

TL;DR: A simple extension to the existing federated features in DB2 UDB is presented, which enables a regular DB2 instance to become a DBCache without any application modification, and an extensive set of experiments with an E-Commerce benchmark is conducted to show the benefits of this approach and illustrate tradeoffs in caching considerations.

...read moreread less

Abstract: While scaling up to the enormous and growing Internet population with unpredictable usage patterns, E-commerce applications face severe challenges in cost and manageability, especially for database servers that are deployed as those applications' backends in a multi-tier configuration. Middle-tier database caching is one solution to this problem. In this paper, we present a simple extension to the existing federated features in DB2 UDB, which enables a regular DB2 instance to become a DBCache without any application modification. On deployment of a DBCache at an application server, arbitrary SQL statements generated from the unchanged application that are intended for a backend database server, can be answered: at the cache, at the backend database server, or at both locations in a distributed manner. The factors that determine the distribution of workload include the SQL statement type, the cache content, the application requirement on data freshness, and cost-based optimization at the cache. We have developed a research prototype of DBCache, and conducted an extensive set of experiments with an E-Commerce benchmark to show the benefits of this approach and illustrate tradeoffs in caching considerations.

...read moreread less

209 citations

Proceedings Article•DOI•

On-the-fly sharing for streamed aggregation

[...]

Sailesh Krishnamurthy¹, Chung Wu², Michael J. Franklin¹•Institutions (2)

University of California, Berkeley¹, Google²

27 Jun 2006

TL;DR: A major contribution is the sharing technique that does not require any up-front multiple query optimization, a significant departure from existing techniques that rely on complex static analyses of fixed query workloads.

...read moreread less

Abstract: Data streaming systems are becoming essential for monitoring applications such as financial analysis and network intrusion detection. These systems often have to process many similar but different queries over common data. Since executing each query separately can lead to significant scalability and performance problems, it is vital to share resources by exploiting similarities in the queries. In this paper we present ways to efficiently share streaming aggregate queries with differing periodic windows and arbitrary selection predicates. A major contribution is our sharing technique that does not require any up-front multiple query optimization. This is a significant departure from existing techniques that rely on complex static analyses of fixed query workloads. Our approach is particularly vital in streaming systems where queries can join and leave the system at any point. We present a detailed performance study that evaluates our strategies with an implementation and real data. In these experiments, our approach gives us as much as an order of magnitude performance improvement over the state of the art.

...read moreread less

206 citations

1
2
3
4
…
5
6

Collapse

Cited by

PDF

Open Access

More filters

Journal Article•DOI•

I and i

[...]

Kevin Barraclough

08 Dec 2001-BMJ

TL;DR: There is, I think, something ethereal about i —the square root of minus one, which seems an odd beast at that time—an intruder hovering on the edge of reality.

...read moreread less

Abstract: There is, I think, something ethereal about i —the square root of minus one. I remember first hearing about it at school. It seemed an odd beast at that time—an intruder hovering on the edge of reality. Usually familiarity dulls this sense of the bizarre, but in the case of i it was the reverse: over the years the sense of its surreal nature intensified. It seemed that it was impossible to write mathematics that described the real world in …

...read moreread less

33,785 citations

Journal Article•DOI•

TinyDB: an acquisitional query processing system for sensor networks

[...]

Samuel Madden¹, Michael J. Franklin², Joseph M. Hellerstein², Wei Hong³•Institutions (3)

Massachusetts Institute of Technology¹, University of California, Berkeley², Intel³

01 Mar 2005

TL;DR: This work evaluates issues in the context of TinyDB, a distributed query processor for smart sensor devices, and shows how acquisitional techniques can provide significant reductions in power consumption on the authors' sensor devices.

...read moreread less

Abstract: We discuss the design of an acquisitional query processor for data collection in sensor networks. Acquisitional issues are those that pertain to where, when, and how often data is physically acquired (sampled) and delivered to query processing operators. By focusing on the locations and costs of acquiring data, we are able to significantly reduce power consumption over traditional passive systems that assume the a priori existence of data. We discuss simple extensions to SQL for controlling data acquisition, and show how acquisitional issues influence query optimization, dissemination, and execution. We evaluate these issues in the context of TinyDB, a distributed query processor for smart sensor devices, and show how acquisitional techniques can provide significant reductions in power consumption on our sensor devices.

...read moreread less

2,065 citations

Proceedings Article•DOI•

Mining time-changing data streams

[...]

Geoff Hulten¹, Laurie Spencer, Pedro Domingos¹•Institutions (1)

University of Washington¹

26 Aug 2001

TL;DR: An efficient algorithm for mining decision trees from continuously-changing data streams, based on the ultra-fast VFDT decision tree learner is proposed, called CVFDT, which stays current while making the most of old data by growing an alternative subtree whenever an old one becomes questionable, and replacing the old with the new when the new becomes more accurate.

...read moreread less

Abstract: Most statistical and machine-learning algorithms assume that the data is a random sample drawn from a stationary distribution. Unfortunately, most of the large databases available for mining today violate this assumption. They were gathered over months or years, and the underlying processes generating them changed during this time, sometimes radically. Although a number of algorithms have been proposed for learning time-changing concepts, they generally do not scale well to very large databases. In this paper we propose an efficient algorithm for mining decision trees from continuously-changing data streams, based on the ultra-fast VFDT decision tree learner. This algorithm, called CVFDT, stays current while making the most of old data by growing an alternative subtree whenever an old one becomes questionable, and replacing the old with the new when the new becomes more accurate. CVFDT learns a model which is similar in accuracy to the one that would be learned by reapplying VFDT to a moving window of examples every time a new example arrives, but with O(1) complexity per example, as opposed to O(w), where w is the size of the window. Experiments on a set of large time-changing data streams demonstrate the utility of this approach.

...read moreread less

1,790 citations

Journal Article•DOI•

Data streams: algorithms and applications

[...]

S. Muthukrishnan¹•Institutions (1)

Rutgers University¹

01 Aug 2005-Foundations and Trends in Theoretical Computer Science

TL;DR: Data Streams: Algorithms and Applications surveys the emerging area of algorithms for processing data streams and associated applications, which rely on metric embeddings, pseudo-random computations, sparse approximation theory and communication complexity.

...read moreread less

Abstract: In the data stream scenario, input arrives very rapidly and there is limited memory to store the input. Algorithms have to work with one or few passes over the data, space less than linear in the input size or time significantly less than the input size. In the past few years, a new theory has emerged for reasoning about algorithms that work within these constraints on space, time, and number of passes. Some of the methods rely on metric embeddings, pseudo-random computations, sparse approximation theory and communication complexity. The applications for this scenario include IP network traffic analysis, mining text message streams and processing massive data sets in general. Researchers in Theoretical Computer Science, Databases, IP Networking and Computer Systems are working on the data stream challenges. This article is an overview and survey of data stream algorithmics and is an updated version of [1].

...read moreread less

1,598 citations

Proceedings Article•

The Design of the Borealis Stream Processing Engine

[...]

Daniel J. Abadi¹, Yanif Ahmad², Magdalena Balazinska¹, Mitch Cherniack³, Jeong-Hyon Hwang², Wolfgang Lindner¹, Anurag S. Maskey³, Alexander Rasin², Esther Ryvkina³, Nesime Tatbul², Ying Xing², Stan Zdonik² - Show less +8 more•Institutions (3)

Massachusetts Institute of Technology¹, Brown University², Brandeis University³

01 Jan 2005

TL;DR: This paper outlines the basic design and functionality of Borealis, and presents a highly flexible and scalable QoS-based optimization model that operates across server and sensor networks and a new fault-tolerance model with flexible consistency-availability trade-offs.

...read moreread less

Abstract: Borealis is a second-generation distributed stream processing engine that is being developed at Brandeis University, Brown University, and MIT. Borealis inherits core stream processing functionality from Aurora [14] and distribution functionality from Medusa [51]. Borealis modifies and extends both systems in non-trivial and critical ways to provide advanced capabilities that are commonly required by newly-emerging stream processing applications. In this paper, we outline the basic design and functionality of Borealis. Through sample real-world applications, we motivate the need for dynamically revising query results and modifying query specifications. We then describe how Borealis addresses these challenges through an innovative set of features, including revision records, time travel, and control lines. Finally, we present a highly flexible and scalable QoS-based optimization model that operates across server and sensor networks and a new fault-tolerance model with flexible consistency-availability trade-offs.

...read moreread less

1,533 citations

1
2
3
4
…
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200

Collapse