Papers published on a yearly basis
Papers
More filters
••
TL;DR: The best known deterministic linear space bound is O(√ log n/log log n) due to as discussed by the authors, where n is the number of integer keys in an ordered set.
Abstract: We introduce exponential search trees as a novel technique for converting static polynomial space search structures for ordered sets into fully-dynamic linear space data structures.This leads to an optimal bound of O(√log n/log log n) for searching and updating a dynamic set X of n integer keys in linear space. Searching X for an integer y means finding the maximum key in X which is smaller than or equal to y. This problem is equivalent to the standard text book problem of maintaining an ordered set.The best previous deterministic linear space bound was O(log n/log log n) due to Fredman and Willard from STOC 1990. No better deterministic search bound was known using polynomial space.We also get the following worst-case linear space trade-offs between the number n, the word length W, and the maximal key U
107 citations
•
27 Aug 1998TL;DR: WHIRL as discussed by the authors is an extension of relational databases that can perform "soft joins" based on the similarity of textual identifiers; these soft joins extend the traditional operation of joining tables based on equivalence of atomic values.
Abstract: WHIRL is an extension of relational databases that can perform "soft joins" based on the similarity of textual identifiers; these soft joins extend the traditional operation of joining tables based on the equivalence of atomic values. This paper evaluates WHIRL on a number of inductive classification tasks using data from the World Wide Web. We show that although WHIRL is designed for more general similarity-based reasoning tasks, it is competitive with mature inductive classification systems on these classification tasks. In particular, WHIRL generally achieves lower generalization error than C4.5, RIPPER, and several nearest-neighbor methods. WHIRL is also fast—up to 500 times faster than C4.5 on some benchmark problems. We also show that WHIRL can be efficiently used to select from a large pool of unlabeled items those that can be classified correctly with high confidence.
107 citations
••
TL;DR: The results indicate that, provided sufficient training on a given class, the resulting strategy performs as well as (and, in some cases, better than) the best branching rule for that class.
107 citations
••
[...]
TL;DR: This work uses the term BGP Beacon to refer to a publicly documented prefix having global visibility and a published schedule for announcements and withdrawals, and describes several BGP Beacons that have been set up at various points in the Internet.
Abstract: The desire to better understand global BGP dynamics has motivated several studies using active measurement techniques, which inject announcements and withdrawals of prefixes from the global routing domain. From these one can measure quantities such as the BGP convergence time. Previously, the route injection infrastructure of such experiments has either been temporary in nature, or its use has been restricted to the experimenters. The routing research community would benefit from a permanent and public infrastructure for such active probes. We use the term BGP Beacon to refer to a publicly documented prefix having global visibility and a published schedule for announcements and withdrawals. A BGP Beacon is to be used for the ongoing study of BGP dynamics, and so should be supportedwith a long-term commitment. We describe several BGP Beacons thathave been set up at various points in the Internet. We then describe techniques for processing BGP updates when a BGP Beacon is observed from a BGP monitoring point such as Oregon's Route Views. Finally, we illustrate the use of BGP Beacons in the analysis of convergence delays, route flap damping, and update inter-arrival times.
107 citations
••
07 Jan 2008TL;DR: It is demonstrated that it is possible to generate a suite of useful data processing tools, including a semi-structured query engine, several format converters, a statistical analyzer and data visualization routines directly from the ad hoc data itself, without any human intervention.
Abstract: An ad hoc data source is any semistructured data source for which useful data analysis and transformation tools are not readily available. Such data must be queried, transformed and displayed by systems administrators, computational biologists, financial analysts and hosts of others on a regular basis. In this paper, we demonstrate that it is possible to generate a suite of useful data processing tools, including a semi-structured query engine, several format converters, a statistical analyzer and data visualization routines directly from the ad hoc data itself, without any human intervention. The key technical contribution of the work is a multi-phase algorithm that automatically infers the structure of an ad hoc data source and produces a format specification in the PADS data description language. Programmers wishing to implement custom data analysis tools can use such descriptions to generate printing and parsing libraries for the data. Alternatively, our software infrastructure will push these descriptions through the PADS compiler, creating format-dependent modules that, when linked with format-independent algorithms for analysis and transformation, result infully functional tools. We evaluate the performance of our inference algorithm, showing it scales linearlyin the size of the training data - completing in seconds, as opposed to the hours or days it takes to write a description by hand. We also evaluate the correctness of the algorithm, demonstrating that generating accurate descriptions often requires less than 5% of theavailable data.
107 citations
Authors
Showing all 1881 results
Name | H-index | Papers | Citations |
---|---|---|---|
Yoshua Bengio | 202 | 1033 | 420313 |
Scott Shenker | 150 | 454 | 118017 |
Paul Shala Henry | 137 | 318 | 35971 |
Peter Stone | 130 | 1229 | 79713 |
Yann LeCun | 121 | 369 | 171211 |
Louis E. Brus | 113 | 347 | 63052 |
Jennifer Rexford | 102 | 394 | 45277 |
Andreas F. Molisch | 96 | 777 | 47530 |
Vern Paxson | 93 | 267 | 48382 |
Lorrie Faith Cranor | 92 | 326 | 28728 |
Ward Whitt | 89 | 424 | 29938 |
Lawrence R. Rabiner | 88 | 378 | 70445 |
Thomas E. Graedel | 86 | 348 | 27860 |
William W. Cohen | 85 | 384 | 31495 |
Michael K. Reiter | 84 | 380 | 30267 |