Topic
Locality-sensitive hashing
About: Locality-sensitive hashing is a research topic. Over its lifetime, 1,894 publications have been published within this topic, receiving 69,362 citations.
Papers
29 Oct 2012
TL;DR: This paper presents an efficient alternating optimization to learn the hashing functions and the optimal kernel combination, and shows that the proposed method can achieve 11% and 34% performance gains over state-of-the-art methods.
Abstract: Hashing methods, which generate binary codes to preserve certain similarity, have recently become attractive in many applications such as large-scale visual search. However, most state-of-the-art hashing methods utilize only a single feature type, while combining multiple features has been proved very helpful in image search. In this paper we propose a novel hashing approach that utilizes the information conveyed by different features. The multiple feature hashing can be formulated as a similarity-preserving problem with optimal linearly combined multiple kernels. Such a formulation is not only compatible with general types of data and diverse types of similarities indicated by different visual features, but also helps achieve fast training and search. We present an efficient alternating optimization to learn the hashing functions and the optimal kernel combination. Experimental results on two well-known benchmarks, CIFAR-10 and NUS-WIDE, show that the proposed method can achieve 11% and 34% performance gains over state-of-the-art methods.
62 citations
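The "linearly combined multiple kernels" idea in the abstract above can be illustrated with a minimal sketch. It assumes a kernelized hash bit of the form sign(sum_j c_j * K(x, a_j) - b), which the abstract does not spell out. The anchors, coefficients, bias, and kernel weights `mu` are hypothetical fixed values here; the paper learns them by alternating optimization, which this sketch does not reproduce.

```python
import math

def rbf_kernel(a, b, gamma=1.0):
    # Gaussian kernel on one feature type.
    return math.exp(-gamma * sum((x - y) ** 2 for x, y in zip(a, b)))

def linear_kernel(a, b):
    # Inner-product kernel on another feature type.
    return sum(x * y for x, y in zip(a, b))

def combined_kernel(x, y, kernels, mu):
    # x and y are tuples holding one vector per feature type.
    # The combined kernel is the mu-weighted sum of per-feature kernels.
    return sum(m * k(xf, yf) for m, k, xf, yf in zip(mu, kernels, x, y))

def hash_bit(x, anchors, coeffs, bias, kernels, mu):
    # One binary code bit from a kernelized linear score (assumed form).
    score = sum(c * combined_kernel(x, a, kernels, mu)
                for c, a in zip(coeffs, anchors))
    return 1 if score - bias >= 0 else 0

# Hypothetical usage: two feature types per sample, equal kernel weights.
kernels = (rbf_kernel, linear_kernel)
mu = (0.5, 0.5)
anchors = [([0.0, 0.0], [1.0, 0.0]), ([1.0, 1.0], [0.0, 1.0])]
sample = ([0.1, 0.0], [0.9, 0.2])
bit = hash_bit(sample, anchors, coeffs=[1.0, -1.0], bias=0.0,
               kernels=kernels, mu=mu)
```

Setting one weight in `mu` to zero reduces the scheme to single-feature kernel hashing, which is the baseline the abstract contrasts against.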
TL;DR: A complete characterization of the probability distribution of the Directory size and depth is derived, and its implications on the design of the directory are studied.
Abstract: Extendible hashing is an attractive direct-access technique which has been introduced recently. It is characterized by a combination of database-size flexibility and fast direct access. This paper derives performance measures for extendible hashing, and considers their implications on the physical database design. A complete characterization of the probability distribution of the directory size and depth is derived, and its implications on the design of the directory are studied. The expected input/output costs of various operations are derived, and the effects of varying physical design parameters on the expected average operating cost and on the expected volume are studied.
62 citations
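For readers unfamiliar with the "directory size and depth" analyzed in the abstract above, here is a minimal in-memory sketch of textbook extendible hashing. It is not the paper's model: the class names, the use of Python's built-in `hash`, indexing by low-order bits, and the `max_depth` guard are all assumptions made for illustration.

```python
class Bucket:
    def __init__(self, local_depth):
        self.local_depth = local_depth  # number of hash bits this bucket distinguishes
        self.items = {}

class ExtendibleHash:
    def __init__(self, bucket_capacity=2, max_depth=20):
        self.bucket_capacity = bucket_capacity
        self.max_depth = max_depth      # guard against keys with identical hashes
        self.global_depth = 0           # directory has 2**global_depth slots
        self.directory = [Bucket(0)]

    def _index(self, key):
        # The low `global_depth` bits of the hash select a directory slot.
        return hash(key) & ((1 << self.global_depth) - 1)

    def get(self, key, default=None):
        return self.directory[self._index(key)].items.get(key, default)

    def put(self, key, value):
        while True:
            bucket = self.directory[self._index(key)]
            if key in bucket.items or len(bucket.items) < self.bucket_capacity:
                bucket.items[key] = value
                return
            self._split(bucket)  # full bucket: split, then retry

    def _split(self, bucket):
        if bucket.local_depth == self.global_depth:
            if self.global_depth >= self.max_depth:
                raise RuntimeError("too many keys share the same hash bits")
            # Double the directory; the new half mirrors the old half.
            self.directory = self.directory + self.directory
            self.global_depth += 1
        bit = 1 << bucket.local_depth   # the newly significant hash bit
        bucket.local_depth += 1
        sibling = Bucket(bucket.local_depth)
        old_items, bucket.items = bucket.items, {}
        for k, v in old_items.items():
            (sibling if hash(k) & bit else bucket).items[k] = v
        # Slots whose index has the new bit set now point to the sibling.
        for i, b in enumerate(self.directory):
            if b is bucket and i & bit:
                self.directory[i] = sibling

# Usage: integer keys hash to themselves in CPython, so the layout is reproducible.
table = ExtendibleHash(bucket_capacity=2)
for k in range(16):
    table.put(k, k * k)
```

In this sketch the directory always has `2 ** global_depth` slots, so the directory size is determined by the depth, and both depend on how the keys happen to fall into buckets.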
30 Jun 2004
TL;DR: This paper proposes a geometry-invariant image hashing scheme that can be employed for content copy detection and tracing; exhaustive experimental results obtained from benchmark attacks have confirmed the performance of the proposed method.
Abstract: Due to the desired non-invasive property, non-data hiding (called media hashing here) is considered to be an alternative to achieve many applications previously accomplished with watermarking. Recently, media hashing techniques for content identification have been gradually emerging. However, none of them are really resistant against geometrical attacks. In this paper, our aim is to propose a geometry-invariant image hashing scheme, which can be employed for content copy detection and tracing. Our system is mainly composed of three components: (i) robust mesh extraction; (ii) mesh-based robust hash extraction; and (iii) hash matching for similarity measurement. Exhaustive experimental results obtained from benchmark attacks have confirmed the performance of the proposed method.
62 citations
26 Jun 2012
TL;DR: The key idea is the bilinear form of the proposed hash functions, which leads to higher collision probability than the existing hyperplane hash functions when using random projections; a learning-based framework that learns the bilinear functions from the data further boosts search performance over the random-projection-based solutions.
Abstract: Hyperplane hashing aims at rapidly searching nearest points to a hyperplane, and has shown practical impact in scaling up active learning with SVMs. Unfortunately, the existing randomized methods need long hash codes to achieve reasonable search accuracy and thus suffer from reduced search speed and large memory overhead. To this end, this paper proposes a novel hyperplane hashing technique which yields compact hash codes. The key idea is the bilinear form of the proposed hash functions, which leads to higher collision probability than the existing hyperplane hash functions when using random projections. To further increase the performance, we propose a learning based framework in which the bilinear functions are directly learned from the data. This results in short yet discriminative codes, and also boosts the search performance over the random projection based solutions. Large-scale active learning experiments carried out on two datasets with up to one million samples demonstrate the overall superiority of the proposed approach.
61 citations
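The randomized variant described in the abstract above can be sketched as follows. This assumes the bilinear form sign((u·z)(v·z)) with Gaussian random vectors u and v, and a flipped sign when hashing the hyperplane normal; the abstract itself does not give the formula, so treat this form as an assumption. The class and method names are hypothetical, and the learned variant is not shown.

```python
import random

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def sign(t):
    return 1 if t >= 0 else -1

class BilinearHyperplaneHash:
    def __init__(self, dim, num_bits, seed=0):
        rng = random.Random(seed)
        # One independent pair of Gaussian projection vectors (u, v) per bit.
        self.pairs = [
            ([rng.gauss(0, 1) for _ in range(dim)],
             [rng.gauss(0, 1) for _ in range(dim)])
            for _ in range(num_bits)
        ]

    def hash_point(self, x):
        # Database points: sign of the bilinear form u^T x x^T v.
        return tuple(sign(dot(u, x) * dot(v, x)) for u, v in self.pairs)

    def hash_hyperplane(self, w):
        # Hyperplane query given by its normal w: same form, sign flipped,
        # so points nearly perpendicular to w (close to the hyperplane)
        # are more likely to collide than points parallel to w.
        return tuple(-sign(dot(u, w) * dot(v, w)) for u, v in self.pairs)

def matching_bits(a, b):
    return sum(1 for p, q in zip(a, b) if p == q)

# Usage: rank candidates by how many bits they share with the query code.
hasher = BilinearHyperplaneHash(dim=3, num_bits=32, seed=7)
normal = [0.0, 0.0, 1.0]
query_code = hasher.hash_hyperplane(normal)
on_plane = matching_bits(query_code, hasher.hash_point([1.0, 2.0, 0.0]))
off_plane = matching_bits(query_code, hasher.hash_point([0.0, 0.0, 5.0]))
```

Because the form is quadratic in the point, `x` and `-x` receive the same code, which suits hyperplane queries where only the unsigned distance to the plane matters. A point parallel to the normal shares no bits with the query in this sketch, so `off_plane` is 0.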
TL;DR: A robust image hashing with dominant discrete cosine transform (DCT) coefficients is proposed that converts the input image to a normalized image, divides it into non-overlapping blocks, extracts dominant DCT coefficients in the first row/column of each block to construct feature matrices, and finally conducts matrix compression by calculating and quantifying column distances.
61 citations