Papers
TL;DR: This article presents DynaMat, a system that manages dynamic collections of materialized aggregate views in a data warehouse, and shows how to derive an efficient update plan with respect to the available maintenance window, the different update policies for the views and the dependencies that exist among them.
Abstract: Materialized aggregate views represent a set of redundant entities in a data warehouse that are frequently used to accelerate On-Line Analytical Processing (OLAP). Due to the complex structure of the data warehouse and the different profiles of the users who submit queries, there is need for tools that will automate and ease the view selection and management processes. In this article we present DynaMat, a system that manages dynamic collections of materialized aggregate views in a data warehouse. At query time, DynaMat utilizes a dedicated disk space for storing computed aggregates that are further engaged for answering new queries. Queries are executed independently or can be bundled within a multiquery expression. In the latter case, we present an execution mechanism that exploits dependencies among the queries and the materialized set to further optimize their execution. During updates, DynaMat reconciles the current materialized view selection and refreshes the most beneficial subset of it within a given maintenance window. We show how to derive an efficient update plan with respect to the available maintenance window, the different update policies for the views and the dependencies that exist among them.
93 citations
04 Nov 2009
TL;DR: Novel techniques for analyzing unidirectional TCP flows are developed, including a technique for inferring ICW size, a method for detecting irregular retransmissions, and a new approach for accurately extracting flow clocks.
Abstract: Since the last in-depth studies of measured TCP traffic some 6-8 years ago, the Internet has experienced significant changes, including the rapid deployment of backbone links with 1-2 orders of magnitude more capacity, the emergence of bandwidth-intensive streaming applications, and the massive penetration of new TCP variants. These and other changes beg the question whether the characteristics of measured TCP traffic in today's Internet reflect these changes or have largely remained the same. To answer this question, we collected and analyzed packet traces from a number of Internet backbone and access links, focused on the "heavy-hitter" flows responsible for the majority of traffic. Next we analyzed their within-flow packet dynamics, and observed the following features: (1) in one of our datasets, up to 15.8% of flows have an initial congestion window (ICW) size larger than the upper bound specified by RFC 3390. (2) Among flows that encounter retransmission rates of more than 10%, 5% of them exhibit irregular retransmission behavior where the sender does not slow down its sending rate during retransmissions. (3) TCP flow clocking (i.e., regular spacing between flights of packets) can be caused by both RTT and non-RTT factors such as application or link layer, and 60% of flows studied show no pronounced flow clocking. To arrive at these findings, we developed novel techniques for analyzing unidirectional TCP flows, including a technique for inferring ICW size, a method for detecting irregular retransmissions, and a new approach for accurately extracting flow clocks.
93 citations
02 Jun 2010
TL;DR: This lecture gives an overview of recent research in stream processing, ranging from answering simple queries on high-speed streams to loading real-time data feeds into a streaming warehouse for off-line analysis.
Abstract: Many applications process high volumes of streaming data, among them Internet traffic analysis, financial tickers, and transaction log mining. In general, a data stream is an unbounded data set that is produced incrementally over time, rather than being available in full before its processing begins. In this lecture, we give an overview of recent research in stream processing, ranging from answering simple queries on high-speed streams to loading real-time data feeds into a streaming warehouse for off-line analysis. We will discuss two types of systems for end-to-end stream processing: Data Stream Management Systems (DSMSs) and Streaming Data Warehouses (SDWs). A traditional database management system typically processes a stream of ad-hoc queries over relatively static data. In contrast, a DSMS evaluates static (long-running) queries on streaming data, making a single pass over the data and using limited working memory. In the first part of this lecture, we will discuss research problems in DSMSs, such as continuous query languages, non-blocking query operators that continually react to new data, and continuous query optimization. The second part covers SDWs, which combine the real-time response of a DSMS by loading new data as soon as they arrive with a data warehouse's ability to manage terabytes of historical data on secondary storage. Table of Contents: Introduction / Data Stream Management Systems / Streaming Data Warehouses / Conclusions
93 citations
TL;DR: Associating online buddies with musical notes, Hubbub lets users (on both PCs and handhelds) interact by way of opportunistic impromptu exchanges, even as they move about.
Abstract: Associating online buddies with musical notes, Hubbub lets users (on both PCs and handhelds) interact by way of opportunistic impromptu exchanges, even as they move about.
92 citations
25 Aug 2003
TL;DR: This work identifies the important dimensions of this design space and characterizes some of the inherent design trade-offs in BGP, and attempts to do this in a general way that is not constrained by the details of BGP.
Abstract: BGP is unique among IP-routing protocols in that routing is determined using semantically rich routing policies. However, this expressiveness has come with hidden risks. The interaction of locally defined routing policies can lead to unexpected global routing anomalies, which can be very difficult to identify and correct in the decentralized and competitive Internet environment. These risks increase as the complexity of local policies increases, which is precisely the current trend. BGP policy languages have evolved in a rather organic fashion with little effort to avoid policy-interaction problems. We believe that researchers should start to consider how to design policy languages for path-vector protocols that avoid such risks and yet retain other desirable features. We take a few steps in this direction by identifying the important dimensions of this design space and characterizing some of the inherent design trade-offs. We attempt to do this in a general way that is not constrained by the details of BGP.
92 citations
Authors
Showing all 1881 results
Name | H-index | Papers | Citations |
---|---|---|---|
Yoshua Bengio | 202 | 1033 | 420313 |
Scott Shenker | 150 | 454 | 118017 |
Paul Shala Henry | 137 | 318 | 35971 |
Peter Stone | 130 | 1229 | 79713 |
Yann LeCun | 121 | 369 | 171211 |
Louis E. Brus | 113 | 347 | 63052 |
Jennifer Rexford | 102 | 394 | 45277 |
Andreas F. Molisch | 96 | 777 | 47530 |
Vern Paxson | 93 | 267 | 48382 |
Lorrie Faith Cranor | 92 | 326 | 28728 |
Ward Whitt | 89 | 424 | 29938 |
Lawrence R. Rabiner | 88 | 378 | 70445 |
Thomas E. Graedel | 86 | 348 | 27860 |
William W. Cohen | 85 | 384 | 31495 |
Michael K. Reiter | 84 | 380 | 30267 |