Author

Shariq Rizvi

Other affiliations: Google

Bio: Shariq Rizvi is an academic researcher from the University of California, Berkeley. The author has contributed to research on wireless sensor networks and complex event processing, has an h-index of 7, and has co-authored 7 publications receiving 1,733 citations. Previous affiliations of Shariq Rizvi include Google.

Papers

Proceedings Article•DOI•

High-performance complex event processing over streams

Eugene Wu¹, Yanlei Diao², Shariq Rizvi³•Institutions (3)

University of California, Berkeley¹, University of Massachusetts Amherst², Google³

27 Jun 2006

TL;DR: This paper proposes a complex event language that significantly extends existing event languages to meet the needs of a range of RFID-enabled monitoring applications and describes a query plan-based approach to efficiently implementing this language.

Abstract: In this paper, we present the design, implementation, and evaluation of a system that executes complex event queries over real-time streams of RFID readings encoded as events. These complex event queries filter and correlate events to match specific patterns, and transform the relevant events into new composite events for the use of external monitoring applications. Stream-based execution of these queries enables time-critical actions to be taken in environments such as supply chain management, surveillance and facility management, healthcare, etc. We first propose a complex event language that significantly extends existing event languages to meet the needs of a range of RFID-enabled monitoring applications. We then describe a query plan-based approach to efficiently implementing this language. Our approach uses native operators to efficiently handle query-defined sequences, which are a key component of complex event processing, and pipeline such sequences to subsequent operators that are built by leveraging relational techniques. We also develop a large suite of optimization techniques to address challenges such as large sliding windows and intermediate result sizes. We demonstrate the effectiveness of our approach through a detailed performance analysis of our prototype implementation under a range of data and query workloads as well as through a comparison to a state-of-the-art stream processor.

902 citations

Proceedings Article•DOI•

Extending query rewriting techniques for fine-grained access control

Shariq Rizvi¹, Alberto O. Mendelzon², Sundararajarao Sudarshan³, Prasan Roy⁴•Institutions (4)

University of California, Berkeley¹, University of Toronto², Indian Institute of Technology Bombay³, IBM⁴

13 Jun 2004

TL;DR: In this paper, a fine-grained access control model based on authorization views is presented, where user queries can be phrased in terms of the database relations, and are valid if they can be answered using only the information contained in these authorization views.

Abstract: Current day database applications, with large numbers of users, require fine-grained access control mechanisms, at the level of individual tuples, not just entire relations/views, to control which parts of the data can be accessed by each user. Fine-grained access control is often enforced in the application code, which has numerous drawbacks; these can be avoided by specifying/enforcing access control at the database level. We present a novel fine-grained access control model based on authorization views that allows "authorization-transparent" querying; that is, user queries can be phrased in terms of the database relations, and are valid if they can be answered using only the information contained in these authorization views. We extend earlier work on authorization-transparent querying by introducing a new notion of validity, conditional validity. We give a powerful set of inference rules to check for query validity. We demonstrate the practicality of our techniques by describing how an existing query optimizer can be extended to perform access control checks by incorporating these inference rules.

371 citations

Proceedings Article•

Design Considerations for High Fan-In Systems: The HiFi Approach.

Michael J. Franklin, Shawn R. Jeffery, Sailesh Krishnamurthy, Frederick Reiss, Shariq Rizvi, Eugene Wu, Owen Cooper, Anil Edakkunni, Wei Hong

01 Jan 2005

TL;DR: This paper identifies the key characteristics and data management challenges presented by high fan-in systems, and argues for a uniform, query-based approach towards addressing them, and presents the initial design concepts behind HiFi.

Abstract: Advances in data acquisition and sensor technologies are leading towards the development of "high fan-in" architectures: widely distributed systems whose edges consist of numerous receptors such as sensor networks, RFID readers, or probes, and whose interior nodes are traditional host computers organized using the principles of cascading streams and successive aggregation. Examples include RFID-enabled supply chain management, large-scale environmental monitoring, and various types of network and computing infrastructure monitoring. In this paper, we identify the key characteristics and data management challenges presented by high fan-in systems, and argue for a uniform, query-based approach towards addressing them. We then present our initial design concepts behind HiFi, the system we are building to embody these ideas, and describe a proof-of-concept prototype.

221 citations

Book Chapter•DOI•

Towards an internet-scale XML dissemination service

Yanlei Diao¹, Shariq Rizvi¹, Michael J. Franklin¹•Institutions (1)

University of California, Berkeley¹

31 Aug 2004

TL;DR: This paper identifies the salient technical challenges in supporting XML filtering and transformation in this environment and proposes techniques for solving them and presents the architectural design of ONYX, a system based on an overlay network.

Abstract: Publish/subscribe systems have demonstrated the ability to scale to large numbers of users and high data rates when providing content-based data dissemination services on the Internet. However, their services are limited by the data semantics and query expressiveness that they support. On the other hand, the recent work on selective dissemination of XML data has made significant progress in moving from XML filtering to the richer functionality of transformation for result customization, but in general has ignored the challenges of deploying such XML-based services on an Internet-scale. In this paper, we address these challenges in the context of incorporating the rich functionality of XML data dissemination in a highly scalable system. We present the architectural design of ONYX, a system based on an overlay network. We identify the salient technical challenges in supporting XML filtering and transformation in this environment and propose techniques for solving them.

191 citations

Proceedings Article•DOI•

Events on the edge

Shariq Rizvi¹, Shawn R. Jeffery¹, Sailesh Krishnamurthy¹, Michael J. Franklin¹, Nathan Burkhart¹, Anil Edakkunni¹, Linus Liang¹•Institutions (1)

University of California, Berkeley¹

14 Jun 2005

TL;DR: This work shows how HiFi generates simple events out of receptor data at its edges and provides high-functionality complex event processing mechanisms for sophisticated event detection using a real-world library scenario.

Abstract: The emergence of large-scale receptor-based systems has enabled applications to execute complex business logic over data generated from monitoring the physical world. An important functionality required by these applications is the detection of and response to complex events, often in real time. Bridging the gap between low-level receptor technology and such high-level needs of applications remains a significant challenge. We demonstrate our solution to this problem in the context of HiFi, a system we are building to solve the data management problems of large-scale receptor-based systems. Specifically, we show how HiFi generates simple events out of receptor data at its edges and provides high-functionality complex event processing mechanisms for sophisticated event detection using a real-world library scenario.

45 citations

Cited by

Proceedings Article•DOI•

CryptDB: protecting confidentiality with encrypted query processing

Raluca Ada Popa¹, Catherine M. S. Redfield¹, Nickolai Zeldovich¹, Hari Balakrishnan¹•Institutions (1)

Massachusetts Institute of Technology¹

23 Oct 2011

TL;DR: The evaluation shows that CryptDB has low overhead, reducing throughput by 14.5% for phpBB, a web forum application, and by 26% for queries from TPC-C, compared to unmodified MySQL.

Abstract: Online applications are vulnerable to theft of sensitive information because adversaries can exploit software bugs to gain access to private data, and because curious or malicious administrators may capture and leak data. CryptDB is a system that provides practical and provable confidentiality in the face of these attacks for applications backed by SQL databases. It works by executing SQL queries over encrypted data using a collection of efficient SQL-aware encryption schemes. CryptDB can also chain encryption keys to user passwords, so that a data item can be decrypted only by using the password of one of the users with access to that data. As a result, a database administrator never gets access to decrypted data, and even if all servers are compromised, an adversary cannot decrypt the data of any user who is not logged in. An analysis of a trace of 126 million SQL queries from a production MySQL server shows that CryptDB can support operations over encrypted data for 99.5% of the 128,840 columns seen in the trace. Our evaluation shows that CryptDB has low overhead, reducing throughput by 14.5% for phpBB, a web forum application, and by 26% for queries from TPC-C, compared to unmodified MySQL. Chaining encryption keys to user passwords requires 11–13 unique schema annotations to secure more than 20 sensitive fields and 2–7 lines of source code changes for three multi-user web applications.

1,269 citations

Journal Article•DOI•

Toward Scalable Systems for Big Data Analytics: A Technology Tutorial

Han Hu¹, Yonggang Wen², Tat-Seng Chua¹, Xuelong Li³•Institutions (3)

National University of Singapore¹, Nanyang Technological University², Chinese Academy of Sciences³

24 Jun 2014-IEEE Access

TL;DR: This paper presents a systematic framework to decompose big data systems into four sequential modules, namely data generation, data acquisition, data storage, and data analytics, and presents the prevalent Hadoop framework for addressing big data challenges.

Abstract: Recent technological advancements have led to a deluge of data from distinctive domains (e.g., health care and scientific sensors, user-generated data, Internet and financial companies, and supply chain systems) over the past two decades. The term big data was coined to capture the meaning of this emerging trend. In addition to its sheer volume, big data also exhibits other unique characteristics as compared with traditional data. For instance, big data is commonly unstructured and requires more real-time analysis. This development calls for new system architectures for data acquisition, transmission, storage, and large-scale data processing mechanisms. In this paper, we present a literature survey and system tutorial for big data analytics platforms, aiming to provide an overall picture for nonexpert readers and instill a do-it-yourself spirit for advanced audiences to customize their own big-data solutions. First, we present the definition of big data and discuss big data challenges. Next, we present a systematic framework to decompose big data systems into four sequential modules, namely data generation, data acquisition, data storage, and data analytics. These four modules form a big data value chain. Following that, we present a detailed survey of numerous approaches and mechanisms from research and industry communities. In addition, we present the prevalent Hadoop framework for addressing big data challenges. Finally, we outline several evaluation benchmarks and potential research directions for big data systems.

1,002 citations

Journal Article•DOI•

Processing flows of information: From data stream to complex event processing

Gianpaolo Cugola¹, Alessandro Margara¹•Institutions (1)

Polytechnic University of Milan¹

14 Jun 2012-ACM Computing Surveys

TL;DR: A general, unifying model is proposed to capture the different aspects of an IFP system and use it to provide a complete and precise classification of the systems and mechanisms proposed so far.

Abstract: A large number of distributed applications requires continuous and timely processing of information as it flows from the periphery to the center of the system. Examples include intrusion detection systems which analyze network traffic in real-time to identify possible attacks; environmental monitoring applications which process raw data coming from sensor networks to identify critical situations; or applications performing online analysis of stock prices to identify trends and forecast future values. Traditional DBMSs, which need to store and index data before processing it, can hardly fulfill the requirements of timeliness coming from such domains. Accordingly, during the last decade, different research communities developed a number of tools, which we collectively call Information flow processing (IFP) systems, to support these scenarios. They differ in their system architecture, data model, rule model, and rule language. In this article, we survey these systems to help researchers, who often come from different backgrounds, in understanding how the various approaches they adopt may complement each other. In particular, we propose a general, unifying model to capture the different aspects of an IFP system and use it to provide a complete and precise classification of the systems and mechanisms proposed so far.

918 citations

Proceedings Article•DOI•

High-performance complex event processing over streams

Eugene Wu¹, Yanlei Diao², Shariq Rizvi³•Institutions (3)

University of California, Berkeley¹, University of Massachusetts Amherst², Google³

27 Jun 2006

902 citations

Proceedings Article•DOI•

Approximate Data Collection in Sensor Networks using Probabilistic Models

David Chu¹, Amol Deshpande², Joseph M. Hellerstein¹, Wei Hong•Institutions (2)

University of California, Berkeley¹, University of Maryland, College Park²

03 Apr 2006

TL;DR: This paper proposes a robust approximate technique called Ken that uses replicated dynamic probabilistic models to minimize communication from sensor nodes to the network’s PC base station, and shows that Ken is well suited to anomaly- and event-detection applications.

Abstract: Wireless sensor networks are proving to be useful in a variety of settings. A core challenge in these networks is to minimize energy consumption. Prior database research has proposed to achieve this by pushing data-reducing operators like aggregation and selection down into the network. This approach has proven unpopular with early adopters of sensor network technology, who typically want to extract complete "dumps" of the sensor readings, i.e., to run "SELECT *" queries. Unfortunately, because these queries do no data reduction, they consume significant energy in current sensornet query processors. In this paper we attack the "SELECT *" problem for sensor networks. We propose a robust approximate technique called Ken that uses replicated dynamic probabilistic models to minimize communication from sensor nodes to the network's PC base station. In addition to data collection, we show that Ken is well suited to anomaly- and event-detection applications. A key challenge in this work is to intelligently exploit spatial correlations across sensor nodes without imposing undue sensor-to-sensor communication burdens to maintain the models. Using traces from two real-world sensor network deployments, we demonstrate that relatively simple models can provide significant communication (and hence energy) savings without undue sacrifice in result quality or frequency. Choosing optimally among even our simple models is NP-hard, but our experiments show that a greedy heuristic performs nearly as well as an exhaustive algorithm.

504 citations
