Andrei Z. Broder

Researcher at Google

Publications -  241
Citations -  28441

Andrei Z. Broder is an academic researcher at Google. The author has contributed to research on the topics Web search query and Web page, has an h-index of 67, and has co-authored 241 publications receiving 27310 citations. Previous affiliations of Andrei Z. Broder include AmeriCorps VISTA & IBM.

Papers
Journal Article

Syntactic clustering of the Web

TL;DR: An efficient way to determine the syntactic similarity of files is developed and applied to every document on the World Wide Web, and a clustering of all the documents that are syntactically similar is built.
Journal Article

Min-Wise Independent Permutations

TL;DR: This research was motivated by the fact that such a family of permutations is essential to the algorithm used in practice by the AltaVista web index software to detect and filter near-duplicate documents.
Journal Article

Balanced Allocations

TL;DR: It is shown that, with high probability, the fullest box contains only ln ln n / ln 2 + O(1) balls, exponentially fewer than before. A similar gap exists in the infinite process, where at each step one ball, chosen uniformly at random, is deleted, and one ball is added in the manner above.
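
The bound in this summary can be explored with a small simulation. The listing does not spell out the placement rule, so the sketch below assumes the usual "two choices" setting: each ball samples a few boxes uniformly at random and goes into the least full one. The function name `max_load` and its parameters are hypothetical, chosen only for illustration.

```python
import math
import random


def max_load(n, choices, rng):
    """Throw n balls into n boxes and return the load of the fullest box.

    Assumed rule (not stated in the listing): each ball samples `choices`
    boxes uniformly at random, with replacement, and is placed in the
    least loaded of them.
    """
    boxes = [0] * n
    for _ in range(n):
        candidates = [rng.randrange(n) for _ in range(choices)]
        target = min(candidates, key=lambda i: boxes[i])
        boxes[target] += 1
    return max(boxes)


if __name__ == "__main__":
    rng = random.Random(0)
    n = 20_000
    print("one choice :", max_load(n, 1, rng))
    print("two choices:", max_load(n, 2, rng))
    # Leading term of the bound quoted above; the O(1) term is not modeled.
    print("ln ln n / ln 2 =", math.log(math.log(n)) / math.log(2))
```

Running it for a few values of n should show the two-choice maximum growing far more slowly than the one-choice maximum, which is the gap the summary describes.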
Book Chapter

Identifying and Filtering Near-Duplicate Documents

TL;DR: The algorithm for filtering near-duplicate documents discussed here has been successfully implemented and has been used for the last three years in the context of the AltaVista search engine.
Proceedings Article

Summary cache: a scalable wide-area Web cache sharing protocol

TL;DR: This paper proposes a new protocol called "Summary Cache"; each proxy keeps a summary of the URLs of cached documents of each participating proxy and checks these summaries for potential hits before sending any queries, which enables cache sharing among a large number of proxies.
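
The lookup flow described in this summary can be sketched as follows. This is an illustration under stated assumptions, not the protocol's actual implementation: the listing does not say how summaries are encoded, so a small Bloom filter is assumed here, and the names `Summary`, `Proxy`, `learn_peer`, and `peers_to_query` are hypothetical.

```python
import hashlib


class Summary:
    """Compact, lossy summary of a set of URLs (a small Bloom filter, assumed)."""

    def __init__(self, bits=1024, hashes=3):
        self.bits = bits
        self.hashes = hashes
        self.field = 0  # bit array stored in a Python int

    def _positions(self, url):
        # Derive `hashes` bit positions from salted SHA-256 digests of the URL.
        for i in range(self.hashes):
            digest = hashlib.sha256(f"{i}:{url}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.bits

    def add(self, url):
        for p in self._positions(url):
            self.field |= 1 << p

    def might_contain(self, url):
        # False means "definitely absent"; True may be a false positive.
        return all((self.field >> p) & 1 for p in self._positions(url))


class Proxy:
    def __init__(self, name):
        self.name = name
        self.cache = {}
        self.own_summary = Summary()
        self.peer_summaries = {}  # peer name -> that peer's Summary

    def store(self, url, body):
        self.cache[url] = body
        self.own_summary.add(url)

    def learn_peer(self, peer):
        # Simplification: shares the live object; a real deployment would
        # ship periodic summary updates over the network instead.
        self.peer_summaries[peer.name] = peer.own_summary

    def peers_to_query(self, url):
        # Consult the summaries first, so only likely holders get a query.
        return [n for n, s in self.peer_summaries.items() if s.might_contain(url)]
```

A proxy that misses locally would call `peers_to_query(url)` and contact only the returned peers; a false positive costs one wasted query, while an empty result means no inter-proxy traffic at all.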