Andrei Z. Broder
Researcher at Google
Publications - 241
Citations - 28441
Andrei Z. Broder is an academic researcher at Google. The author has contributed to research in topics: Web search query & Web page. The author has an h-index of 67 and has co-authored 241 publications receiving 27310 citations. Previous affiliations of Andrei Z. Broder include AmeriCorps VISTA & IBM.
Papers
Journal Article
Syntactic clustering of the Web
TL;DR: An efficient way to determine the syntactic similarity of files is developed and applied to every document on the World Wide Web, and a clustering of all the documents that are syntactically similar is built.
Journal Article
Min-Wise Independent Permutations
TL;DR: This research on min-wise independent families of permutations was motivated by the fact that such a family is essential to the algorithm used in practice by the AltaVista web index software to detect and filter near-duplicate documents.
Journal Article
Balanced Allocations
TL;DR: It is shown that, with high probability, the fullest box contains only ln ln n/ln 2 + O(1) balls, exponentially fewer than before, and that a similar gap exists in the infinite process, where at each step one ball, chosen uniformly at random, is deleted, and one ball is added in the manner above.
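The summary above leaves the placement rule implicit. A rough simulation can make the gap concrete, assuming the usual setup for this result: n balls are thrown into n boxes, and each ball is placed in the less full of two boxes chosen uniformly at random. The function name `max_load` and the parameter `choices` are illustrative only and do not come from the paper.

```python
import random

def max_load(n, choices, rng):
    """Throw n balls into n boxes and return the load of the fullest box.

    Each ball samples `choices` boxes uniformly at random (with replacement)
    and goes into the least full of them. choices=1 is plain random placement.
    """
    boxes = [0] * n
    for _ in range(n):
        candidates = [rng.randrange(n) for _ in range(choices)]
        target = min(candidates, key=lambda i: boxes[i])
        boxes[target] += 1
    return max(boxes)

rng = random.Random(0)  # fixed seed so repeated runs agree
n = 50_000
one_choice = max_load(n, 1, rng)
two_choices = max_load(n, 2, rng)
print("fullest box, one choice: ", one_choice)
print("fullest box, two choices:", two_choices)
```

In runs of this sketch the two-choice maximum should come out noticeably smaller than the one-choice maximum, in line with the ln ln n/ln 2 + O(1) bound quoted above.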
Book Chapter
Identifying and Filtering Near-Duplicate Documents
TL;DR: The algorithm for filtering near-duplicate documents discussed here has been successfully implemented and has been used for the last three years in the context of the AltaVista search engine.
Proceedings Article
Summary cache: a scalable wide-area Web cache sharing protocol
TL;DR: This paper proposes a new protocol called "Summary Cache"; each proxy keeps a summary of the URLs of cached documents of each participating proxy and checks these summaries for potential hits before sending any queries, which enables cache sharing among a large number of proxies.
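The lookup step described above can be sketched as follows. This is a minimal illustration, not the protocol itself: it assumes each summary is a small Bloom filter over cached URLs, and the names `BloomSummary`, `peers_to_query`, the filter sizes, and the example URLs are all hypothetical. A Bloom filter can report false positives but never false negatives, so a peer that holds the document is never skipped.

```python
import hashlib

class BloomSummary:
    """Compact, lossy summary of the URLs one proxy has cached (illustrative)."""

    def __init__(self, num_bits=1024, num_hashes=4):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits)  # one byte per bit, for readability

    def _positions(self, url):
        # Derive num_hashes bit positions by salting SHA-256 with an index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{url}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, url):
        for pos in self._positions(url):
            self.bits[pos] = 1

    def might_contain(self, url):
        # True may be a false positive; False is always correct.
        return all(self.bits[pos] for pos in self._positions(url))

def peers_to_query(url, summaries):
    """Return the peers whose summary suggests they may have `url` cached."""
    return [name for name, summary in summaries.items()
            if summary.might_contain(url)]

# Local copies of the summaries of two hypothetical peer proxies.
summaries = {"proxy_a": BloomSummary(), "proxy_b": BloomSummary()}
summaries["proxy_a"].add("http://example.com/index.html")
summaries["proxy_b"].add("http://example.org/logo.png")

candidates = peers_to_query("http://example.com/index.html", summaries)
print(candidates)
```

Only the peers returned by `peers_to_query` would be sent a query; if the list is empty, the proxy can go straight to the origin server without any inter-proxy traffic.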