Papers published on a yearly basis
Papers
More filters
••
13 Aug 2012TL;DR: A first-of-its-kind and in-depth analysis of one of the largest IXPs worldwide based on nine months' worth of sFlow records collected at that IXP in 2011 suggests that these large IXPs can be viewed as a microcosm of the Internet ecosystem itself and argues for a re-assessment of the mental picture the community has about this ecosystem.
Abstract: The largest IXPs carry on a daily basis traffic volumes in the petabyte range, similar to what some of the largest global ISPs reportedly handle. This little-known fact is due to a few hundreds of member ASes exchanging traffic with one another over the IXP's infrastructure. This paper reports on a first-of-its-kind and in-depth analysis of one of the largest IXPs worldwide based on nine months' worth of sFlow records collected at that IXP in 2011.A main finding of our study is that the number of actual peering links at this single IXP exceeds the number of total AS links of the peer-peer type in the entire Internet known as of 2010! To explain such a surprisingly rich peering fabric, we examine in detail this IXP's ecosystem and highlight the diversity of networks that are members at this IXP and connect there with other member ASes for reasons that are similarly diverse, but can be partially inferred from their business types and observed traffic patterns. In the process, we investigate this IXP's traffic matrix and illustrate what its temporal and structural properties can tell us about the member ASes that generated the traffic in the first place. While our results suggest that these large IXPs can be viewed as a microcosm of the Internet ecosystem itself, they also argue for a re-assessment of the mental picture that our community has about this ecosystem.
278 citations
••
08 Sep 2001TL;DR: TAX is complete for relational algebra extended with aggregation, and can express most queries expressible in popular XML query languages, and forms the basis for the Timber XML database system currently under development by the authors.
Abstract: Querying XML has been the subject of much recent investigation. A formal bulk algebra is essential for applying database-style optimization to XML queries. We develop such an algebra, called TAX (Tree Algebra for XML), for manipulating XML data, modeled as forests of labeled ordered trees. Motivated both by aesthetic considerations of intuitiveness, and by efficient computability and amenability to optimization, we develop TAX as a natural extension of relational algebra, with a small set of operators. TAX is complete for relational algebra extended with aggregation, and can express most queries expressible in popular XML query languages. It forms the basis for the Timber XML database system currently under development by us.
278 citations
••
06 Jul 2002TL;DR: Algorithms which rerank the top N hypotheses from a maximum-entropy tagger, the application being the recovery of named-entity boundaries in a corpus of web data, using the voted perceptron algorithm.
Abstract: This paper describes algorithms which rerank the top N hypotheses from a maximum-entropy tagger, the application being the recovery of named-entity boundaries in a corpus of web data. The first approach uses a boosting algorithm for ranking problems. The second approach uses the voted perceptron algorithm. Both algorithms give comparable, significant improvements over the maximum-entropy baseline. The voted perceptron algorithm can be considerably more efficient to train, at some cost in computation on test examples.
277 citations
••
TL;DR: The investigation indicates that delay transmitter diversity with quaternary phase-shift keying (QPSK) modulation and adaptive antenna arrays provides a good quality of service (QoS) with low retransmission probability, while space-time coding transmitter diversity provides high peak data rates.
Abstract: Transmitter diversity and down-link beamforming can be used in high-rate data wireless networks with orthogonal frequency division multiplexing (OFDM) for capacity improvement. We compare the performance of delay, permutation and space-time coding transmitter diversity for high-rate packet data wireless networks using OFDM modulation. For these systems, relatively high block error rates, such as 10%, are acceptable assuming the use of effective automatic retransmission request (ARQ). As an alternative, we also consider using the same number of transmitter antennas for down-link beamforming as we consider for transmitter diversity. The investigation indicates that delay transmitter diversity with quaternary phase-shift keying (QPSK) modulation and adaptive antenna arrays provides a good quality of service (QoS) with low retransmission probability, while space-time coding transmitter diversity provides high peak data rates. Down-link beamforming together with adaptive antenna arrays, however, provides a higher capacity than transmitter diversity for typical mobile environments.
277 citations
••
01 Jun 2004TL;DR: A novel data streaming algorithm to provide much more accurate estimates of flow distribution, using a "lossy data structure" which consists of an array of counters fitted well into SRAM, which not only dramatically improves the accuracy offlow distribution measurement, but also contributes to the field of data streaming.
Abstract: Knowing the distribution of the sizes of traffic flows passing through a network link helps a network operator to characterize network resource usage, infer traffic demands, detect traffic anomalies, and accommodate new traffic demands through better traffic engineering. Previous work on estimating the flow size distribution has been focused on making inferences from sampled network traffic. Its accuracy is limited by the (typically) low sampling rate required to make the sampling operation affordable. In this paper we present a novel data streaming algorithm to provide much more accurate estimates of flow distribution, using a "lossy data structure" which consists of an array of counters fitted well into SRAM. For each incoming packet, our algorithm only needs to increment one underlying counter, making the algorithm fast enough even for 40 Gbps (OC-768) links. The data structure is lossy in the sense that sizes of multiple flows may collide into the same counter. Our algorithm uses Bayesian statistical methods such as Expectation Maximization to infer the most likely flow size distribution that results in the observed counter values after collision. Evaluations of this algorithm on large Internet traces obtained from several sources (including a tier-1 ISP) demonstrate that it has very high measurement accuracy (within 2%). Our algorithm not only dramatically improves the accuracy of flow distribution measurement, but also contributes to the field of data streaming by formalizing an existing methodology and applying it to the context of estimating the flow-distribution.
276 citations
Authors
Showing all 1881 results
Name | H-index | Papers | Citations |
---|---|---|---|
Yoshua Bengio | 202 | 1033 | 420313 |
Scott Shenker | 150 | 454 | 118017 |
Paul Shala Henry | 137 | 318 | 35971 |
Peter Stone | 130 | 1229 | 79713 |
Yann LeCun | 121 | 369 | 171211 |
Louis E. Brus | 113 | 347 | 63052 |
Jennifer Rexford | 102 | 394 | 45277 |
Andreas F. Molisch | 96 | 777 | 47530 |
Vern Paxson | 93 | 267 | 48382 |
Lorrie Faith Cranor | 92 | 326 | 28728 |
Ward Whitt | 89 | 424 | 29938 |
Lawrence R. Rabiner | 88 | 378 | 70445 |
Thomas E. Graedel | 86 | 348 | 27860 |
William W. Cohen | 85 | 384 | 31495 |
Michael K. Reiter | 84 | 380 | 30267 |