Institution

AT&T Labs

Company

About: AT&T Labs is a company known for research contributions in the topics Network packet and The Internet. The organization has 1879 authors who have published 5595 publications receiving 483151 citations.

Topics: Network packet, The Internet, Server, Quality of service, Cellular network

Papers

Journal Article

Integrating conflicting data: the role of source dependence

Xin Luna Dong¹, Laure Berti-Equille², Divesh Srivastava¹

AT&T Labs¹, University of Rennes²

01 Aug 2009

TL;DR: This paper applies Bayesian analysis to decide dependence between sources, designs an algorithm that iteratively detects dependence and discovers truth from conflicting information, and extends the model by considering the accuracy of data sources and the similarity between values.

Abstract: Many data management applications, such as setting up Web portals, managing enterprise data, managing community data, and sharing scientific data, require integrating data from multiple sources. Each of these sources provides a set of values and different sources can often provide conflicting values. To present quality data to users, it is critical that data integration systems can resolve conflicts and discover true values. Typically, we expect a true value to be provided by more sources than any particular false one, so we can take the value provided by the majority of the sources as the truth. Unfortunately, a false value can be spread through copying and that makes truth discovery extremely tricky. In this paper, we consider how to find true values from conflicting information when there are a large number of sources, among which some may copy from others. We present a novel approach that considers dependence between data sources in truth discovery. Intuitively, if two data sources provide a large number of common values and many of these values are rarely provided by other sources (e.g., particular false values), it is very likely that one copies from the other. We apply Bayesian analysis to decide dependence between sources and design an algorithm that iteratively detects dependence and discovers truth from conflicting information. We also extend our model by considering accuracy of data sources and similarity between values. Our experiments on synthetic data as well as real-world data show that our algorithm can significantly improve accuracy of truth discovery and is scalable when there are a large number of data sources.
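
The majority-vote baseline mentioned in the abstract, and the way copying can mislead it, can be illustrated with a minimal Python sketch. This is not the paper's Bayesian algorithm: the source names, the toy claims, and the assumption that the copiers are already known are all hypothetical and exist only for illustration.

```python
from collections import Counter


def majority_vote(claims):
    """Return, for each data item, the value claimed by the most sources.

    claims maps a source name to a dict of {item: value}.
    """
    votes = {}
    for items in claims.values():
        for item, value in items.items():
            votes.setdefault(item, Counter())[value] += 1
    return {item: counter.most_common(1)[0][0] for item, counter in votes.items()}


# Hypothetical toy data: S3 and S4 are assumed to copy S2's false value.
claims = {
    "S1": {"capital_of_australia": "Canberra"},
    "S2": {"capital_of_australia": "Sydney"},
    "S3": {"capital_of_australia": "Sydney"},
    "S4": {"capital_of_australia": "Sydney"},
    "S5": {"capital_of_australia": "Canberra"},
}

# Naive voting counts the copied value three times, so the false value wins.
naive = majority_vote(claims)

# If the copiers were known (an assumption made here for illustration),
# dropping them leaves only independent votes and the true value wins.
copiers = {"S3", "S4"}
independent = {s: items for s, items in claims.items() if s not in copiers}
deduplicated = majority_vote(independent)

print(naive, deduplicated)
```

In the paper the copiers are not given in advance; per the abstract, dependence is inferred with Bayesian analysis and refined iteratively together with the truth estimates.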

439 citations

Proceedings Article

Dynamic authenticated index structures for outsourced databases

Feifei Li¹, Marios Hadjieleftheriou², George Kollios¹, Leonid Reyzin¹

Boston University¹, AT&T Labs²

27 Jun 2006

TL;DR: This work defines a variety of essential and practical cost metrics associated with ODB systems, looks at solutions that can handle dynamic scenarios where owners periodically update the data residing at the servers, and reports substantial performance improvements over existing approaches in both static and dynamic environments.

Abstract: In outsourced database (ODB) systems the database owner publishes its data through a number of remote servers, with the goal of enabling clients at the edge of the network to access and query the data more efficiently. As servers might be untrusted or can be compromised, query authentication becomes an essential component of ODB systems. Existing solutions for this problem concentrate mostly on static scenarios and are based on idealistic properties for certain cryptographic primitives. In this work, first we define a variety of essential and practical cost metrics associated with ODB systems. Then, we analytically evaluate a number of different approaches, in search for a solution that best leverages all metrics. Most importantly, we look at solutions that can handle dynamic scenarios, where owners periodically update the data residing at the servers. Finally, we discuss query freshness, a new dimension in data authentication that has not been explored before. A comprehensive experimental evaluation of the proposed and existing approaches is used to validate the analytical models and verify our claims. Our findings exhibit that the proposed solutions improve performance substantially over existing approaches, both for static and dynamic environments.

434 citations

Journal Article

Fitting mixtures of exponentials to long-tail distributions to analyze network performance models

Anja Feldmann¹, Ward Whitt¹

AT&T Labs¹

01 Jan 1998-Performance Evaluation

TL;DR: An algorithm for approximating a long-tail distribution by a hyperexponential distribution (a finite mixture of exponentials) is developed, and it is proved that, in principle, distributions from a large class, including the Pareto and Weibull distributions, can be approximated arbitrarily closely by hyperexponential distributions.

434 citations

Journal Article

Supertagging: an approach to almost parsing

Srinivas Bangalore¹, Aravind K. Joshi²

AT&T Labs¹, University of Pennsylvania²

01 Jun 1999-Computational Linguistics

TL;DR: Novel methods for robust parsing that integrate the flexibility of linguistically motivated lexical descriptions with the robustness of statistical techniques are proposed.

Abstract: In this paper, we have proposed novel methods for robust parsing that integrate the flexibility of linguistically motivated lexical descriptions with the robustness of statistical techniques. Our thesis is that the computation of linguistic structure can be localized if lexical items are associated with rich descriptions (supertags) that impose complex constraints in a local context. The supertags are designed such that only those elements on which the lexical item imposes constraints appear within a given supertag. Further, each lexical item is associated with as many supertags as the number of different syntactic contexts in which the lexical item can appear. This makes the number of different descriptions for each lexical item much larger than when the descriptions are less complex, thus increasing the local ambiguity for a parser. But this local ambiguity can be resolved by using statistical distributions of supertag co-occurrences collected from a corpus of parses. We have explored these ideas in the context of the Lexicalized Tree-Adjoining Grammar (LTAG) framework. The supertags in LTAG combine both phrase structure information and dependency information in a single representation. Supertag disambiguation results in a representation that is effectively a parse (an almost parse), and the parser need "only" combine the individual supertags. This method of parsing can also be used to parse sentence fragments such as in spoken utterances where the disambiguated supertag sequence may not combine into a single structure.

434 citations

Proceedings Article

PrivBayes: private data release via bayesian networks

Jun Zhang¹, Graham Cormode², Cecilia M. Procopiuc³, Divesh Srivastava³, Xiaokui Xiao¹

Nanyang Technological University¹, University of Warwick², AT&T Labs³

18 Jun 2014

TL;DR: This paper presents PrivBayes, a differentially private method for releasing high-dimensional data that circumvents the curse of dimensionality, and introduces a novel approach that uses a surrogate function for mutual information to build the model more accurately.

Abstract: Privacy-preserving data publishing is an important problem that has been the focus of extensive study. The state-of-the-art solution for this problem is differential privacy, which offers a strong degree of privacy protection without making restrictive assumptions about the adversary. Existing techniques using differential privacy, however, cannot effectively handle the publication of high-dimensional data. In particular, when the input dataset contains a large number of attributes, existing methods require injecting a prohibitive amount of noise compared to the signal in the data, which renders the published data next to useless. To address the deficiency of the existing methods, this paper presents PrivBayes, a differentially private method for releasing high-dimensional data. Given a dataset D, PrivBayes first constructs a Bayesian network N, which (i) provides a succinct model of the correlations among the attributes in D and (ii) allows us to approximate the distribution of data in D using a set P of low-dimensional marginals of D. After that, PrivBayes injects noise into each marginal in P to ensure differential privacy and then uses the noisy marginals and the Bayesian network to construct an approximation of the data distribution in D. Finally, PrivBayes samples tuples from the approximate distribution to construct a synthetic dataset, and then releases the synthetic data. Intuitively, PrivBayes circumvents the curse of dimensionality, as it injects noise into the low-dimensional marginals in P instead of the high-dimensional dataset D. Private construction of Bayesian networks turns out to be significantly challenging, and we introduce a novel approach that uses a surrogate function for mutual information to build the model more accurately. We experimentally evaluate PrivBayes on real data and demonstrate that it significantly outperforms existing solutions in terms of accuracy.
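
The noise-injection step described in the abstract can be illustrated with a minimal Python sketch of a generic Laplace mechanism applied to one low-dimensional marginal. It is not the PrivBayes implementation: the function name `noisy_marginal`, the toy rows, the default sensitivity of 1, and the clip-and-normalize post-processing are all assumptions made for illustration, and the privacy budget split across marginals is not modeled.

```python
import itertools
import random
from collections import Counter


def noisy_marginal(rows, domains, epsilon, sensitivity=1.0, rng=None):
    """Return a noisy, normalized marginal over the attributes in `domains`.

    rows:     list of dicts, one per tuple in the dataset.
    domains:  dict {attribute: list of possible values}; every cell of the
              cross product gets noise, including cells with zero count.
    epsilon:  privacy parameter for this single marginal (assumed given).
    """
    rng = rng or random.Random()
    attrs = list(domains)
    counts = Counter(tuple(row[a] for a in attrs) for row in rows)
    scale = sensitivity / epsilon  # Laplace scale b = sensitivity / epsilon

    noisy = {}
    for cell in itertools.product(*(domains[a] for a in attrs)):
        # The difference of two i.i.d. exponentials with mean `scale`
        # is a Laplace(0, scale) sample.
        laplace = rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)
        noisy[cell] = max(0.0, counts[cell] + laplace)  # clip negatives

    total = sum(noisy.values())
    if total == 0:
        # Degenerate case: fall back to a uniform distribution.
        return {cell: 1.0 / len(noisy) for cell in noisy}
    return {cell: value / total for cell, value in noisy.items()}


# Hypothetical toy dataset with two binary attributes.
rows = [
    {"smoker": "yes", "age": "old"},
    {"smoker": "no", "age": "young"},
    {"smoker": "no", "age": "old"},
    {"smoker": "no", "age": "young"},
]
domains = {"smoker": ["yes", "no"], "age": ["young", "old"]}
marginal = noisy_marginal(rows, domains, epsilon=1.0, rng=random.Random(0))
print(marginal)
```

Because the table has only four cells, the added noise stays small relative to the counts. That is the intuition the abstract gives for perturbing low-dimensional marginals rather than the full high-dimensional dataset.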

433 citations

Authors

Showing all 1881 results

Name	H-index	Papers	Citations
Yoshua Bengio	202	1033	420313
Scott Shenker	150	454	118017
Paul Shala Henry	137	318	35971
Peter Stone	130	1229	79713
Yann LeCun	121	369	171211
Louis E. Brus	113	347	63052
Jennifer Rexford	102	394	45277
Andreas F. Molisch	96	777	47530
Vern Paxson	93	267	48382
Lorrie Faith Cranor	92	326	28728
Ward Whitt	89	424	29938
Lawrence R. Rabiner	88	378	70445
Thomas E. Graedel	86	348	27860
William W. Cohen	85	384	31495
Michael K. Reiter	84	380	30267

Network Information

Related Institutions (5)

Microsoft

86.9K papers, 4.1M citations

94% related

Google

39.8K papers, 2.1M citations

38.6K papers, 1.3M citations

90% related

Hewlett-Packard

59.8K papers, 1.4M citations

89% related

Bell Labs

59.8K papers, 3.1M citations

88% related

Performance

Metrics

Papers: 5,600

Citations: 517,237

No. of papers from the Institution in previous years
Year	Papers
2022	5
2021	33
2020	69
2019	71
2018	100
2017	91