Topic

Universal hashing

About: Universal hashing is a research topic. Over the lifetime, 1482 publications have been published within this topic receiving 69315 citations.

...read moreread less

Papers published on a yearly basis

Papers

PDF

Open Access

More filters

Proceedings Article•

[...]

Aristides Gionis¹, Piotr Indyk¹, Rajeev Motwani¹•Institutions (1)

Stanford University¹

07 Sep 1999

TL;DR: Experimental results indicate that the novel scheme for approximate similarity search based on hashing scales well even for a relatively large number of dimensions, and provides experimental evidence that the method gives improvement in running time over other methods for searching in highdimensional spaces based on hierarchical tree decomposition.

...read moreread less

Abstract: The nearestor near-neighbor query problems arise in a large variety of database applications, usually in the context of similarity searching. Of late, there has been increasing interest in building search/index structures for performing similarity search over high-dimensional data, e.g., image databases, document collections, time-series databases, and genome databases. Unfortunately, all known techniques for solving this problem fall prey to the \curse of dimensionality." That is, the data structures scale poorly with data dimensionality; in fact, if the number of dimensions exceeds 10 to 20, searching in k-d trees and related structures involves the inspection of a large fraction of the database, thereby doing no better than brute-force linear search. It has been suggested that since the selection of features and the choice of a distance metric in typical applications is rather heuristic, determining an approximate nearest neighbor should su ce for most practical purposes. In this paper, we examine a novel scheme for approximate similarity search based on hashing. The basic idea is to hash the points Supported by NAVY N00014-96-1-1221 grant and NSF Grant IIS-9811904. Supported by Stanford Graduate Fellowship and NSF NYI Award CCR-9357849. Supported by ARO MURI Grant DAAH04-96-1-0007, NSF Grant IIS-9811904, and NSF Young Investigator Award CCR9357849, with matching funds from IBM, Mitsubishi, Schlumberger Foundation, Shell Foundation, and Xerox Corporation. Permission to copy without fee all or part of this material is granted provided that the copies are not made or distributed for direct commercial advantage, the VLDB copyright notice and the title of the publication and its date appear, and notice is given that copying is by permission of the Very Large Data Base Endowment. To copy otherwise, or to republish, requires a fee and/or special permission from the Endowment. Proceedings of the 25th VLDB Conference, Edinburgh, Scotland, 1999. from the database so as to ensure that the probability of collision is much higher for objects that are close to each other than for those that are far apart. We provide experimental evidence that our method gives signi cant improvement in running time over other methods for searching in highdimensional spaces based on hierarchical tree decomposition. Experimental results also indicate that our scheme scales well even for a relatively large number of dimensions (more than 50).

...read moreread less

3,705 citations

Proceedings Article•DOI•

Locality-sensitive hashing scheme based on p-stable distributions

[...]

Mayur Datar¹, Nicole Immorlica², Piotr Indyk², Vahab Mirrokni²•Institutions (2)

Stanford University¹, Massachusetts Institute of Technology²

08 Jun 2004

TL;DR: A novel Locality-Sensitive Hashing scheme for the Approximate Nearest Neighbor Problem under lp norm, based on p-stable distributions that improves the running time of the earlier algorithm and yields the first known provably efficient approximate NN algorithm for the case p<1.

...read moreread less

Abstract: We present a novel Locality-Sensitive Hashing scheme for the Approximate Nearest Neighbor Problem under lp norm, based on p-stable distributions.Our scheme improves the running time of the earlier algorithm for the case of the lp norm. It also yields the first known provably efficient approximate NN algorithm for the case p

...read moreread less

3,109 citations

Proceedings Article•

Spectral Hashing

[...]

Yair Weiss¹, Antonio Torralba², Rob Fergus³•Institutions (3)

Hebrew University of Jerusalem¹, Massachusetts Institute of Technology², Courant Institute of Mathematical Sciences³

08 Dec 2008

TL;DR: The problem of finding a best code for a given dataset is closely related to the problem of graph partitioning and can be shown to be NP hard and a spectral method is obtained whose solutions are simply a subset of thresholded eigenvectors of the graph Laplacian.

...read moreread less

Abstract: Semantic hashing[1] seeks compact binary codes of data-points so that the Hamming distance between codewords correlates with semantic similarity. In this paper, we show that the problem of finding a best code for a given dataset is closely related to the problem of graph partitioning and can be shown to be NP hard. By relaxing the original problem, we obtain a spectral method whose solutions are simply a subset of thresholded eigenvectors of the graph Laplacian. By utilizing recent results on convergence of graph Laplacian eigenvectors to the Laplace-Beltrami eigenfunctions of manifolds, we show how to efficiently calculate the code of a novel data-point. Taken together, both learning the code and applying it to a novel point are extremely simple. Our experiments show that our codes outperform the state-of-the art.

...read moreread less

2,641 citations

Journal Article•DOI•

New hash functions and their use in authentication and set equality

[...]

Mark N. Wegman¹, J. Lawrence Carter¹•Institutions (1)

IBM¹

01 Jun 1981-Journal of Computer and System Sciences

TL;DR: Several new classes of hash functions with certain desirable properties are exhibited, and two novel applications for hashing which make use of these functions are introduced, including a provably secure authentication technique for sending messages over insecure lines and the application of testing sets for equality.

...read moreread less

1,586 citations

Proceedings Article•DOI•

Supervised hashing with kernels

[...]

Wei Liu¹, Jun Wang², Rongrong Ji¹, Yu-Gang Jiang³, Shih-Fu Chang¹ - Show less +1 more•Institutions (3)

Columbia University¹, IBM², Fudan University³

16 Jun 2012

TL;DR: A novel kernel-based supervised hashing model which requires a limited amount of supervised information, i.e., similar and dissimilar data pairs, and a feasible training cost in achieving high quality hashing, and significantly outperforms the state-of-the-arts in searching both metric distance neighbors and semantically similar neighbors is proposed.

...read moreread less

Abstract: Recent years have witnessed the growing popularity of hashing in large-scale vision problems. It has been shown that the hashing quality could be boosted by leveraging supervised information into hash function learning. However, the existing supervised methods either lack adequate performance or often incur cumbersome model training. In this paper, we propose a novel kernel-based supervised hashing model which requires a limited amount of supervised information, i.e., similar and dissimilar data pairs, and a feasible training cost in achieving high quality hashing. The idea is to map the data to compact binary codes whose Hamming distances are minimized on similar pairs and simultaneously maximized on dissimilar pairs. Our approach is distinct from prior works by utilizing the equivalence between optimizing the code inner products and the Hamming distances. This enables us to sequentially and efficiently train the hash functions one bit at a time, yielding very short yet discriminative codes. We carry out extensive experiments on two image benchmarks with up to one million samples, demonstrating that our approach significantly outperforms the state-of-the-arts in searching both metric distance neighbors and semantically similar neighbors, with accuracy gains ranging from 13% to 46%.

...read moreread less

1,461 citations

Collapse

Network Information

Performance

Metrics

1,579

Papers

75,526

Citations

No. of papers in the topic in previous years
Year	Papers
2023	31
2022	62
2021	11
2020	12
2019	12
2018	38

Universal hashing

Papers published on a yearly basis

Papers

Trending Questions (9)

Network Information

Related Topics (5)

Performance

Metrics