Institution

AT&T Labs

Company
About: AT&T Labs is a company known for its research contributions in the topics of Network packet & The Internet. The organization has 1879 authors who have published 5595 publications receiving 483151 citations.


Papers
Journal ArticleDOI
TL;DR: Generalized Baum-Welch updates are derived for factorial hidden Markov models whose transition matrices are parameterized as a convex combination, or mixture, of simpler dynamical models.
Abstract: We study Markov models whose state spaces arise from the Cartesian product of two or more discrete random variables. We show how to parameterize the transition matrices of these models as a convex combination—or mixture—of simpler dynamical models. The parameters in these models admit a simple probabilistic interpretation and can be fitted iteratively by an Expectation-Maximization (EM) procedure. We derive a set of generalized Baum-Welch updates for factorial hidden Markov models that make use of this parameterization. We also describe a simple iterative procedure for approximately computing the statistics of the hidden states. Throughout, we give examples where mixed memory models provide a useful representation of complex stochastic processes.
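The core idea above, writing a large transition model as a convex combination of simpler per-chain transition matrices, can be sketched in a few lines. The example below is a hypothetical illustration under simplified assumptions (a single mixture weight vector `psi` and component matrices `A[k]`); the names and dimensions are not taken from the paper.

```python
import numpy as np

# Minimal sketch (not the paper's code): a mixed-memory parameterization in which
# the next value depends on a convex combination of simpler per-chain transition
# matrices. Shapes and names are illustrative assumptions.

rng = np.random.default_rng(0)

n_chains, n_states = 3, 4                      # composite state = one value per chain
psi = rng.dirichlet(np.ones(n_chains))         # mixture weights, sum to 1
A = rng.dirichlet(np.ones(n_states), size=(n_chains, n_states))
# A[k, i, j] = probability that the next value is j given chain k currently holds i

def transition_prob(prev_state, next_value):
    """P(next_value | prev_state) as a convex combination of simple models."""
    return sum(psi[k] * A[k, prev_state[k], next_value] for k in range(n_chains))

prev = (0, 2, 1)                               # current value of each component chain
probs = np.array([transition_prob(prev, j) for j in range(n_states)])
print(probs, probs.sum())                      # a valid distribution: sums to 1
```

Because `psi` and each row of the component matrices are normalized, the mixture is itself a valid transition distribution, which is what makes an EM-style fitting procedure of the kind described above well defined.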

170 citations

Proceedings ArticleDOI
11 Jun 2007
TL;DR: The first space- and time-efficient algorithms for approximating complex aggregate queries over probabilistic data streams are proposed; they offer strong randomized estimation guarantees while using only sublinear space in the size of the stream(s), and rely on novel, concise streaming sketch synopses that extend conventional sketching ideas to the probabilistic streams setting.
Abstract: The management of uncertain, probabilistic data has recently emerged as a useful paradigm for dealing with the inherent unreliabilities of several real-world application domains, including data cleaning, information integration, and pervasive, multi-sensor computing. Unlike conventional data sets, a set of probabilistic tuples defines a probability distribution over an exponential number of possible worlds (i.e., "grounded", deterministic databases). This "possible-worlds" interpretation allows for clean query semantics but also raises hard computational problems for probabilistic database query processors. To further complicate matters, in many scenarios (e.g., large-scale process and environmental monitoring using multiple sensor modalities), probabilistic data tuples arrive and need to be processed in a streaming fashion; that is, using limited memory and CPU resources and without the benefit of multiple passes over a static probabilistic database. Such probabilistic data streams raise a host of new research challenges for stream-processing engines that, to date, remain largely unaddressed. In this paper, we propose the first space- and time-efficient algorithms for approximating complex aggregate queries (including the number of distinct values and join/self-join sizes) over probabilistic data streams. Following the possible-worlds semantics, such aggregates essentially define probability distributions over the space of possible aggregation results, and our goal is to characterize such distributions through efficient approximations of their key moments (such as expectation and variance). Our algorithms offer strong randomized estimation guarantees while using only sublinear space in the size of the stream(s), and rely on novel, concise streaming sketch synopses that extend conventional sketching ideas to the probabilistic streams setting. Our experimental results verify the effectiveness of our approach.
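To make the possible-worlds moment view concrete, the toy example below computes the expectation and variance of one very simple aggregate, COUNT, over a stream of independent probabilistic tuples; for independent tuples these two moments accumulate exactly in a single pass with constant state. This is only an illustration of the semantics, not the paper's sublinear-space sketches for distinct values and join sizes.

```python
# Hypothetical illustration of the possible-worlds view for an easy aggregate:
# COUNT over a stream of independent probabilistic tuples. Each tuple exists
# with probability p, so COUNT is a sum of independent Bernoulli variables and
# its expectation and variance accumulate in one pass with O(1) state.
# (The paper's sketches for distinct counts and join sizes are far more involved;
# this is not that algorithm.)

def count_moments(prob_stream):
    """Return (E[COUNT], Var[COUNT]) over the possible worlds of the stream."""
    mean = var = 0.0
    for p in prob_stream:          # p = existence probability of the tuple
        mean += p
        var += p * (1.0 - p)
    return mean, var

print(count_moments([0.9, 0.5, 0.99, 0.1]))   # (2.49, 0.4399)
```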

170 citations

Proceedings ArticleDOI
27 Oct 2003
TL;DR: In this article, the authors consider how well traffic engineering works with estimated traffic matrices in the context of a specific task, namely optimizing network routing to minimize congestion, as measured by maximum link utilization.
Abstract: Traffic engineering and traffic matrix estimation are often treated as separate fields, even though one of the major applications for a traffic matrix is traffic engineering. In cases where a traffic matrix cannot be measured directly, it may still be estimated from indirect data (such as link measurements), but these estimates contain errors. Yet little thought has been given to the effects of inexact traffic estimates on traffic engineering. In this paper we consider how well traffic engineering works with estimated traffic matrices in the context of a specific task; namely that of optimizing network routing to minimize congestion, measured by maximum link-utilization. Our basic question is: how well is the real traffic routed if the routing is only optimized for an estimated traffic matrix? We compare against optimal routing of the real traffic using data derived from an operational tier-1 ISP. We find that the magnitude of errors in the traffic matrix estimate is not, in itself, a good indicator of the performance of that estimate in route optimization. Likewise, the optimal algorithm for traffic engineering given knowledge of the real traffic matrix is no longer the best with only the estimated traffic matrix as input. Our main practical finding is that the combination of a known traffic matrix estimation technique and a known traffic engineering technique can get close to the optimum in avoiding congestion for the real traffic. We even demonstrate stability in the sense that routing optimized on data from one day continued to perform well on subsequent days. This stability is crucial for the practical relevance to off-line traffic engineering, as it can be performed by ISPs today.
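The objective being optimized above, maximum link utilization, is easy to state in code. The sketch below assumes single-path routing and hypothetical container names (`routing`, `capacity`, and the toy numbers are illustrative, not from the paper); it only evaluates the objective for a fixed routing, whereas the paper studies how route optimization against an estimated traffic matrix performs on the real traffic.

```python
from collections import defaultdict

# Minimal sketch of the congestion objective: maximum link utilization of a
# network under a given routing and traffic matrix. routing[(s, d)] lists the
# links on the path carrying demand s->d (single-path routing assumed);
# capacity[link] is that link's capacity. Names and numbers are illustrative.

def max_link_utilization(traffic_matrix, routing, capacity):
    load = defaultdict(float)
    for (src, dst), demand in traffic_matrix.items():
        for link in routing[(src, dst)]:
            load[link] += demand
    return max(load[link] / capacity[link] for link in load)

traffic = {("A", "C"): 40.0, ("B", "C"): 30.0}
routes = {("A", "C"): [("A", "B"), ("B", "C")], ("B", "C"): [("B", "C")]}
caps = {("A", "B"): 100.0, ("B", "C"): 100.0}
print(max_link_utilization(traffic, routes, caps))   # 0.7, on link B->C
```

Route optimization then searches over routings (for example, by tuning link weights) to minimize this quantity; the paper's question is how much that objective degrades when the traffic matrix fed to the optimizer is only an estimate.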

169 citations

Proceedings ArticleDOI
12 Aug 2007
TL;DR: It is shown that k-mins sketches can be derived from the corresponding bottom-k sketches, which enables the use of bottom-k sketches with off-the-shelf k-mins estimators; data structures are also developed and analyzed that incrementally construct bottom-k sketches and all-distances bottom-k sketches.
Abstract: A bottom-k sketch is a summary of a set of items with nonnegative weights that supports approximate query processing. A sketch is obtained by associating with each item in a ground set an independent random rank drawn from a probability distribution that depends on the weight of the item and including the k items with smallest rank value. Bottom-k sketches are an alternative to k-mins sketches [9], which consist of the k minimum ranked items in k independent rank assignments, and of min-hash [5] sketches, where hash functions replace random rank assignments. Sketches support approximate aggregations, including weight and selectivity of a subpopulation. Coordinated sketches of multiple subsets over the same ground set support subset-relation queries such as Jaccard similarity or the weight of the union. All-distances sketches are applicable for datasets where items lie in some metric space such as data streams (time) or networks. These sketches compactly encode the respective plain sketches of all neighborhoods of a location. These sketches support queries posed over time windows or neighborhoods and time/spatially decaying aggregates. An important advantage of bottom-k sketches, established in a line of recent work, is much tighter estimators for several basic aggregates. To materialize this benefit, we must adapt traditional k-mins applications to use bottom-k sketches. We propose all-distances bottom-k sketches and develop and analyze data structures that incrementally construct bottom-k sketches and all-distances bottom-k sketches. Another advantage of bottom-k sketches is that when the data is represented explicitly, they can be obtained much more efficiently than k-mins sketches. We show that k-mins sketches can be derived from respective bottom-k sketches, which enables the use of bottom-k sketches with off-the-shelf k-mins estimators. (In fact, we obtain tighter estimators since each bottom-k sketch is a distribution over k-mins sketches).
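The basic construction and one standard use of coordinated bottom-k sketches can be illustrated for the unweighted case: assign every item a shared pseudo-random rank, keep the k smallest per set, and estimate Jaccard similarity from the two sketches. The code below is a textbook-style sketch under these simplifying assumptions (unit weights, a hash-based rank), not the paper's weighted estimators or all-distances structures; the function names are hypothetical.

```python
import hashlib

# Minimal bottom-k sketches for unweighted items, with a shared ("coordinated")
# hash playing the role of the random rank assignment. Names and parameters are
# illustrative assumptions, not the paper's code.

def rank(item):
    """Deterministic pseudo-random rank in [0, 1), shared across sets."""
    h = hashlib.sha1(item.encode()).hexdigest()
    return int(h, 16) / 16 ** len(h)

def bottom_k(items, k):
    """Keep the k items with smallest rank."""
    return set(sorted(items, key=rank)[:k])

def jaccard_estimate(sketch_a, sketch_b, k):
    """Estimate Jaccard(A, B) from coordinated bottom-k sketches of A and B."""
    smallest_union = set(sorted(sketch_a | sketch_b, key=rank)[:k])  # bottom-k of A union B
    return len(smallest_union & sketch_a & sketch_b) / k

a = {f"item{i}" for i in range(0, 800)}
b = {f"item{i}" for i in range(400, 1200)}
sa, sb = bottom_k(a, 64), bottom_k(b, 64)
print(jaccard_estimate(sa, sb, 64))   # close to the true Jaccard, 400/1200 = 1/3
```

The coordination (one shared rank function) is what lets sketches of different sets be combined: the k smallest ranks in the union of the two sketches are exactly the bottom-k sketch of the union, and the fraction of them present in both sketches estimates the Jaccard similarity.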

169 citations


Authors

Showing all 1881 results

Name | H-index | Papers | Citations
Yoshua Bengio | 202 | 1033 | 420313
Scott Shenker | 150 | 454 | 118017
Paul Shala Henry | 137 | 318 | 35971
Peter Stone | 130 | 1229 | 79713
Yann LeCun | 121 | 369 | 171211
Louis E. Brus | 113 | 347 | 63052
Jennifer Rexford | 102 | 394 | 45277
Andreas F. Molisch | 96 | 777 | 47530
Vern Paxson | 93 | 267 | 48382
Lorrie Faith Cranor | 92 | 326 | 28728
Ward Whitt | 89 | 424 | 29938
Lawrence R. Rabiner | 88 | 378 | 70445
Thomas E. Graedel | 86 | 348 | 27860
William W. Cohen | 85 | 384 | 31495
Michael K. Reiter | 84 | 380 | 30267
Network Information
Related Institutions (5)
Microsoft: 86.9K papers, 4.1M citations (94% related)
Google: 39.8K papers, 2.1M citations (91% related)
Hewlett-Packard: 59.8K papers, 1.4M citations (89% related)
Bell Labs: 59.8K papers, 3.1M citations (88% related)

Performance Metrics
No. of papers from the Institution in previous years

Year | Papers
2022 | 5
2021 | 33
2020 | 69
2019 | 71
2018 | 100
2017 | 91