Institution

AT&T Labs

Company
About: AT&T Labs is a company. It is known for research contributions in the topics of Network packet and The Internet. The organization has 1879 authors who have published 5595 publications receiving 483151 citations.


Papers
Proceedings ArticleDOI
07 Apr 2008
TL;DR: This work focuses on weighted similarity functions like TF/IDF and introduces variants well suited for set similarity selections in a relational database context; these variants have special semantic properties that can be exploited to design efficient index structures and algorithms for answering queries.
Abstract: Data collections often have inconsistencies that arise due to a variety of reasons, and it is desirable to be able to identify and resolve them efficiently. Set similarity queries are commonly used in data cleaning for matching similar data. In this work we concentrate on set similarity selection queries: Given a query set, retrieve all sets in a collection with similarity greater than some threshold. Various set similarity measures have been proposed in the past for data cleaning purposes. In this work we concentrate on weighted similarity functions like TF/IDF, and introduce variants that are well suited for set similarity selections in a relational database context. These variants have special semantic properties that can be exploited to design very efficient index structures and algorithms for answering queries efficiently. We present modifications of existing technologies to work for set similarity selection queries. We also introduce three novel algorithms based on the Threshold Algorithm that exploit the semantic properties of the new similarity measures to achieve the best performance in theory and practice.

112 citations
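The paper above relies on specialized index structures and Threshold-Algorithm variants; as a rough illustration only, the sketch below computes a TF/IDF-style weighted set similarity and answers a threshold selection by brute force over a toy collection. The collection, the exact weighting, and the threshold are assumptions for illustration, not the paper's definitions.

```python
# Minimal sketch (not the paper's algorithms): TF/IDF-weighted set similarity
# and a brute-force threshold selection over a small toy collection.
import math
from collections import defaultdict

collection = {
    1: {"jane", "doe", "main", "st"},
    2: {"john", "doe", "main", "street"},
    3: {"acme", "corp", "main", "st"},
}

# IDF weight per token: rarer tokens contribute more to the similarity.
df = defaultdict(int)
for tokens in collection.values():
    for t in tokens:
        df[t] += 1
N = len(collection)
idf = {t: math.log(1 + N / c) for t, c in df.items()}

def weight(tokens):
    # L2-normalized TF/IDF vector for a set (TF is 1 for set members);
    # unseen tokens get a default IDF as an illustrative assumption.
    default = math.log(1 + N)
    norm = math.sqrt(sum(idf.get(t, default) ** 2 for t in tokens))
    return {t: idf.get(t, default) / norm for t in tokens}

def similarity(a, b):
    wa, wb = weight(a), weight(b)
    return sum(wa[t] * wb[t] for t in a & b)

def selection(query, threshold):
    # Retrieve all sets in the collection with similarity above the threshold.
    return [sid for sid, s in collection.items() if similarity(query, s) >= threshold]

print(selection({"jane", "doe", "main", "street"}, 0.5))
```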

Book ChapterDOI
Mikkel Thorup
08 Jul 2004
TL;DR: A solution to the fully-dynamic all pairs shortest path problem for a directed graph with arbitrary weights allowing negative cycles, supporting each vertex update in \(O(n^2(\log n + \log^2(\overline{m}/n)))\) amortized time.
Abstract: We present a solution to the fully-dynamic all pairs shortest path problem for a directed graph with arbitrary weights allowing negative cycles. We support each vertex update in \(O(n^2(\log n + \log^2(\overline{m}/n)))\) amortized time. Here, n is the number of vertices, m the number of edges, and \(\overline{m} = n + m\). A vertex update inserts or deletes a vertex with all incident edges, and we update a complete distance matrix accordingly. The algorithm runs on a comparison-addition based pointer machine.

112 citations
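For contrast with the amortized bound above, here is a naive baseline, not Thorup's algorithm: rebuild the full distance matrix with Floyd-Warshall after every vertex update, which costs O(n^3) per update instead of the paper's \(O(n^2(\log n + \log^2(\overline{m}/n)))\) amortized. The class and method names are illustrative assumptions.

```python
# Naive dynamic APSP baseline for contrast (not the paper's method):
# every vertex insertion/deletion triggers a full O(n^3) Floyd-Warshall rebuild.
import math

class NaiveDynamicAPSP:
    def __init__(self):
        self.adj = {}  # vertex -> {neighbor: weight}, directed, arbitrary weights

    def insert_vertex(self, v, out_edges=(), in_edges=()):
        # out_edges: (neighbor, weight) pairs; in_edges: (source, weight) pairs.
        self.adj[v] = dict(out_edges)
        for u, w in in_edges:
            if u in self.adj:
                self.adj[u][v] = w
        return self._rebuild()

    def delete_vertex(self, v):
        self.adj.pop(v, None)
        for nbrs in self.adj.values():
            nbrs.pop(v, None)
        return self._rebuild()

    def _rebuild(self):
        # Plain Floyd-Warshall; a negative diagonal entry flags a vertex
        # that lies on a negative cycle.
        vs = list(self.adj)
        dist = {u: {v: (0 if u == v else self.adj[u].get(v, math.inf)) for v in vs}
                for u in vs}
        for k in vs:
            for i in vs:
                for j in vs:
                    if dist[i][k] + dist[k][j] < dist[i][j]:
                        dist[i][j] = dist[i][k] + dist[k][j]
        return dist

apsp = NaiveDynamicAPSP()
apsp.insert_vertex("a")
apsp.insert_vertex("b", out_edges=[("a", 2)], in_edges=[("a", -1)])
print(apsp.insert_vertex("c", out_edges=[("b", 3)], in_edges=[("a", 4)]))
```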

Proceedings ArticleDOI
18 Apr 2016
TL;DR: This paper proposes a novel distributed reconstruction technique, called Partial Parallel Repair (PPR), which divides the reconstruction operation into small partial operations and schedules them on multiple nodes already involved in the data reconstruction, reducing repair time and degraded read time significantly.
Abstract: With the explosion of data in applications all around us, erasure-coded storage has emerged as an attractive alternative to replication because, even with significantly lower storage overhead, it provides better reliability against data loss. Reed-Solomon code is the most widely used erasure code because it provides maximum reliability for a given storage overhead and is flexible in the choice of coding parameters that determine the achievable reliability. However, reconstruction time for unavailable data becomes prohibitively long, mainly because of network bottlenecks. Some proposed solutions either use additional storage or limit the coding parameters that can be used. In this paper, we propose a novel distributed reconstruction technique, called Partial Parallel Repair (PPR), which divides the reconstruction operation into small partial operations and schedules them on multiple nodes already involved in the data reconstruction. A distributed protocol then progressively combines these partial results to reconstruct the unavailable data blocks, which reduces the network pressure. Theoretically, our technique can complete the network transfer in ⌈log2(k + 1)⌉ time, compared to k time needed for a (k, m) Reed-Solomon code. Our experiments show that PPR reduces repair time and degraded read time significantly. Moreover, our technique is compatible with existing erasure codes and does not require any additional storage overhead. We demonstrate this by overlaying PPR on top of two prior schemes, Local Reconstruction Code and Rotated Reed-Solomon code, to gain additional savings in reconstruction time.

112 citations
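A small sketch of the round-counting idea behind PPR, not the actual repair protocol: combining partial results pairwise in a tree takes about ⌈log2(k + 1)⌉ sequential transfer rounds instead of the k serialized transfers of a centralized repair. XOR stands in for the finite-field linear combinations used in real Reed-Solomon decoding, and all names and values below are assumptions.

```python
# Sketch of the round-counting argument (not the paper's protocol):
# tree-structured combination of partial repair results cuts the number of
# sequential network transfers from k to roughly ceil(log2(k + 1)).
import math

def centralized_rounds(k):
    # Conventional repair: one node pulls all k surviving blocks, and its
    # downlink serializes them, so roughly k block-transfer times.
    return k

def ppr_rounds(k):
    # Tree aggregation over the k partial results plus the requesting node.
    return math.ceil(math.log2(k + 1))

def tree_combine(partials, combine=lambda a, b: a ^ b):
    # Pairwise combination per round; XOR is a stand-in for the actual
    # finite-field linear combination used by Reed-Solomon decoding.
    rounds = 0
    while len(partials) > 1:
        nxt = [combine(*partials[i:i + 2]) if i + 1 < len(partials) else partials[i]
               for i in range(0, len(partials), 2)]
        partials, rounds = nxt, rounds + 1
    return partials[0], rounds

for k in (6, 12):
    print(f"k={k}: centralized {centralized_rounds(k)} rounds, PPR ~{ppr_rounds(k)} rounds")

value, rounds = tree_combine([0b1010, 0b0110, 0b0011, 0b1111, 0b0001])
print(bin(value), rounds)
```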

Proceedings ArticleDOI
27 May 2003
TL;DR: This paper investigates adapting a lexicalized probabilistic context-free grammar (PCFG) to a novel domain, using maximum a posteriori (MAP) estimation, and shows F-measure parsing accuracy gains of as much as 2.5% for high-accuracy lexicalized parsing through the use of out-of-domain treebanks.
Abstract: This paper investigates adapting a lexicalized probabilistic context-free grammar (PCFG) to a novel domain, using maximum a posteriori (MAP) estimation. The MAP framework is general enough to include some previous model adaptation approaches, such as corpus mixing in Gildea (2001), for example. Other approaches falling within this framework are more effective. In contrast to the results in Gildea (2001), we show F-measure parsing accuracy gains of as much as 2.5% for high accuracy lexicalized parsing through the use of out-of-domain treebanks, with the largest gains when the amount of in-domain data is small. MAP adaptation can also be based on either supervised or unsupervised adaptation data. Even when no in-domain treebank is available, unsupervised techniques provide a substantial accuracy gain over unadapted grammars, as much as nearly 5% F-measure improvement.

112 citations
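A simplified view of the MAP estimation discussed above, treating the Dirichlet prior as pseudo-counts taken from an out-of-domain model and mixed with in-domain rule counts. The paper's lexicalized PCFG and smoothing are considerably more involved; the toy grammar counts and the prior weight alpha below are illustrative assumptions.

```python
# Simplified sketch of MAP adaptation for PCFG rule probabilities:
# a Dirichlet prior viewed as alpha pseudo-counts distributed according to
# the out-of-domain model and added to in-domain counts.
from collections import defaultdict

def relative_freq(rule_counts):
    # Maximum-likelihood rule probabilities P(rhs | lhs).
    totals = defaultdict(float)
    for (lhs, _), c in rule_counts.items():
        totals[lhs] += c
    return {(lhs, rhs): c / totals[lhs] for (lhs, rhs), c in rule_counts.items()}

def map_adapt(out_counts, in_counts, alpha=10.0):
    # MAP estimate: in-domain counts plus alpha pseudo-counts from the
    # out-of-domain model, renormalized per left-hand side.
    p_out = relative_freq(out_counts)
    rules = set(out_counts) | set(in_counts)
    totals = defaultdict(float)
    for lhs, rhs in rules:
        totals[lhs] += in_counts.get((lhs, rhs), 0.0) + alpha * p_out.get((lhs, rhs), 0.0)
    return {(lhs, rhs): (in_counts.get((lhs, rhs), 0.0)
                         + alpha * p_out.get((lhs, rhs), 0.0)) / totals[lhs]
            for lhs, rhs in rules}

# Toy counts (assumptions): large out-of-domain treebank, small in-domain one.
out_domain = {("NP", "DT NN"): 800.0, ("NP", "NNP"): 200.0}
in_domain = {("NP", "DT NN"): 30.0, ("NP", "NN NN"): 20.0}
for rule, p in sorted(map_adapt(out_domain, in_domain).items()):
    print(rule, round(p, 3))
```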

Proceedings ArticleDOI
01 Aug 2000
TL;DR: This work formally models a soft database as a noisy version of some unknown hard database, formulates hardening as an optimization problem, and gives a nontrivial nearly linear time algorithm for finding a local optimum.
Abstract: The web contains a large quantity of unstructured information. In many cases, it is possible to heuristically extract structured information, but the resulting databases are "soft": they contain inconsistencies and duplication, and lack unique, consistently-used object identifiers. Examples include large bibliographic databases harvested from raw scientific papers or databases constructed by merging heterogeneous "hard" databases. Here we formally model a soft database as a noisy version of some unknown hard database. We then consider the hardening problem, i.e., the problem of inferring the most likely underlying hard database given a particular soft database. A key feature of our approach is that hardening is global: many sources of evidence for a given hard fact are taken into account. We formulate hardening as an optimization problem and give a nontrivial nearly linear time algorithm for finding a local optimum.

112 citations
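As a loose illustration of the hardening idea, not the paper's cost model or its nearly linear time algorithm, the sketch below runs a greedy local search that maps soft records to hard identifiers, trading the number of posited hard facts against how well they explain the soft records. The costs and the string-similarity measure are assumptions.

```python
# Minimal local-search sketch for "hardening" a soft database (illustrative
# only): choose an assignment of soft records to hard facts that balances the
# cost of positing hard facts against the cost of poor explanations.
from difflib import SequenceMatcher

soft_records = ["A. Smith", "Alice Smith", "Alice Smyth", "Bob Jones", "B. Jones"]
HARD_FACT_COST = 1.0   # assumed price of positing one underlying hard record
MISMATCH_COST = 3.0    # assumed price scale for dissimilar explanations

def mismatch(soft, hard):
    return MISMATCH_COST * (1.0 - SequenceMatcher(None, soft, hard).ratio())

def total_cost(assignment):
    # assignment: soft record -> chosen hard fact (a representative string)
    hard_facts = set(assignment.values())
    return HARD_FACT_COST * len(hard_facts) + sum(mismatch(s, h) for s, h in assignment.items())

# Trivial starting hypothesis: every soft record is its own hard fact.
assignment = {s: s for s in soft_records}

# Greedy local moves: reassign one soft record to another hard fact whenever
# that lowers the total cost; stop at a local optimum.
improved = True
while improved:
    improved = False
    for s in soft_records:
        for h in set(assignment.values()) - {assignment[s]}:
            candidate = {**assignment, s: h}
            if total_cost(candidate) < total_cost(assignment):
                assignment, improved = candidate, True

print(assignment, round(total_cost(assignment), 2))
```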


Authors

Showing all 1881 results

Name                    H-index    Papers    Citations
Yoshua Bengio           202        1033      420313
Scott Shenker           150        454       118017
Paul Shala Henry        137        318       35971
Peter Stone             130        1229      79713
Yann LeCun              121        369       171211
Louis E. Brus           113        347       63052
Jennifer Rexford        102        394       45277
Andreas F. Molisch      96         777       47530
Vern Paxson             93         267       48382
Lorrie Faith Cranor     92         326       28728
Ward Whitt              89         424       29938
Lawrence R. Rabiner     88         378       70445
Thomas E. Graedel       86         348       27860
William W. Cohen        85         384       31495
Michael K. Reiter       84         380       30267
Network Information
Related Institutions (5)
Microsoft
86.9K papers, 4.1M citations

94% related

Google
39.8K papers, 2.1M citations

91% related

Hewlett-Packard
59.8K papers, 1.4M citations

89% related

Bell Labs
59.8K papers, 3.1M citations

88% related

Performance Metrics
No. of papers from the Institution in previous years
Year    Papers
2022    5
2021    33
2020    69
2019    71
2018    100
2017    91